Data sharing will pay dividends


As public pressure builds for drug companies to make more results available from clinical trials, 
the industry should not forget that it relies on collective goodwill to test new therapies. 


Readers of Nature who are familiar with recent controversies
surrounding clinical trials and medical practice may find it
bizarre that anyone could be “surprised and concerned to
discover that information is routinely withheld from doctors and
researchers about the methods and results of clinical trials”, as stated
in a UK government report last week.

After all, campaigning doctors have warned for years that pharmaceutical
companies have in the past concealed data that reflected
poorly on their drugs. Regulators, notably the London-based
European Medicines Agency, have pushed for more information
to be released into the public domain. And the drug industry itself
has moved to open its private data vaults, albeit not by as much or as
quickly as campaigners would like.

The ‘surprise’ and ‘concern’ at this well-chronicled behaviour of
big pharma comes in a report from the UK Parliament’s Committee
of Public Accounts, a cross-party group with the remit to scrutinize
whatever public-expenditure topic catches its eye. The report focuses
on the antiviral drug Tamiflu (oseltamivir), used to treat influenza,
which the UK government and others have stockpiled at great
expense owing to concerns about pandemic flu.

Independent scientists who want to investigate whether Tamiflu
works have struggled to find all the information they need, in part
because its manufacturer Roche, headquartered in Basel, Switzerland,
held back some details of trials. The company says that it has now
released all its Tamiflu data, and researchers are working their way
through them, but the case has become a high-profile example of the
need for greater openness in biomedical science.

The report made headlines in the United Kingdom and, according
to campaigners, vindicated their position. Its call for the UK government
to improve the availability of clinical-trial information is also in
line with initiatives elsewhere.

For instance, the European Medicines Agency is already pushing
forward plans to release the clinical-trial data that drug companies
submit to it when seeking approval for their products. Legal action
taken against the agency by two aggrieved companies has slowed progress.
But in December, the body again stated its “firm commitment to
pursuing the objective of full transparency regarding clinical trial data”.

Meanwhile, the European Union is overhauling its clinical-trial
legislation. Final agreement is some months away, but it seems likely
that pharmaceutical firms will eventually be required to upload at least
a summary of all their trials to a publicly accessible website.

The two major drug-industry associations in Europe and North
America have also moved towards openness. Their joint policy on
access to clinical-trial data came into force at the start of this year,
adding to promises made by individual companies such as GlaxoSmithKline,
Sanofi and Roche to share more data on their drugs.

Yet critics, notably the AllTrials campaign group, say that the
industry is not going far enough. Their main complaints are that
transparency initiatives are not retrospective and do not require the
inclusion of data from older trials, and that some in industry want to
act as information gatekeepers, determining which researchers have
a genuine need for their expensively assembled data sets. These campaigners
deserve credit for raising the issue and for their perseverance
in pushing for change.

The industry is at a crossroads. As the UK committee's report
shows, concern over the behaviour of pharmaceutical firms is growing.
Any anger over the industry's perceived underhand tactics when it
comes to data transparency could spread from the vociferous (but
small) community of politically active medics and policy campaigners
to the wider public.

Drug companies perform a vital function for society in fighting
disease and preserving public health. But recent history is full
of examples of the public turning against businesses with essential
roles. From banks and energy firms to the oil industry, an increasingly
networked and ethically aware public is now capable of dramatic and
damaging pushbacks against disliked companies.

It may be true that the worst practices of big pharma are behind it,
but shades of that bad attitude linger. And past misdeeds have a habit
of coming back to bite. Drug companies must remember that they
need the public and its goodwill to test their medicines in the first
place. They may have to release more information than they would
like, but if they do, it will safeguard the trust and support of the people
on whom they ultimately rely.


Risk management 


Teams aimed at preventing violence on campus 
can offer a lifeline to those in crisis. 


Amy Bishop walked into a faculty meeting with a loaded pistol
and shot six of her colleagues, killing three. Acts of violence
involving multiple victims are extremely rare, especially on college 
campuses, which tend to be safer than the areas that surround them. 
But highly publicized events such as Bishop's rampage and the shooting 
at Virginia Polytechnic Institute and State University in Blacksburg in 
2007 — which had one of the highest death tolls of any attack on a col- 
lege campus — have spurred rapid growth in what is known as threat 
assessment and management. Developed by behavioural psychologists 
working with agencies such as the US Secret Service, threat assessment
aims to identify concerning behaviour and situations, and to take pre- 
emptive action to stop them escalating into violence. 

This can involve simply confronting an individual about inappro- 
priate behaviour — aggression towards colleagues, for example — and 
working with them to correct it. Or it can include maintaining con- 
tinual contact with an individual and putting them in touch with any 
help they might need, such as mental-health services. 

It is a challenging goal. Universities are big, complex environments 
where many students, staff and members of the community interact, 
not always peacefully. But existing networks that organize and moni- 
tor housing, health, grades and social activities do offer ways for uni- 
versities to identify aberrant or shifting behaviour, as well as a robust 
support structure to get people back on track. 

A News Feature on page 150 explores the growth of such pro- 
grammes and teams, particularly in the United States, where easy 
access to guns and several high-profile shootings have put the public 
on high alert. There seem to be some clear benefits, but the spread of 
these interdisciplinary teams, which often include law-enforcement 
officials and representatives of university mental-health services, also 
presents several risks. 

One risk that team members often worry about is how to balance 
individuals’ civil liberties with the need to protect others. In an age in 
which privacy is increasingly illusory, life within the boundaries of a 
college campus can be put under close scrutiny with little effort. And 
freedoms of speech and expression must be maintained if institutions 
of higher education are to continue to nurture ideas. 

Another risk of the focus on threat assessment is more subtle, and 
relates to the all-too-easy assumption that people who commit unthink- 
able acts of violence are driven by mental illness. It is true that mental 
illness is implicated in many high-profile cases of targeted violence and 
that many behaviours that would initiate a call to a threat-assessment 
team are related to a deteriorating mental state. But the links between 
violence and mental illness are complex and hardly correlative. Most 
violence is perpetrated by people who are not mentally ill, and most
people with mental-health problems do not commit violent crimes.
The rhetoric of threat assessment therefore runs the risk of further
ostracizing people who already face stigma. Many cases managed by a
threat-assessment team — there are several hundred referrals per year 
at an institution such as Virginia Tech — are for students or staff going 
through a crisis in their personal or professional life. Practitioners 
are quick to point out that theirs is a support-focused process, more 
about putting individuals in touch with the help they need to weather 
that crisis than punishing them, banishing them or branding them as 
potential threats. 


Such nuances can be hard for an individual to remember when facing a
threat-assessment investigation. And the leading part played by
law-enforcement officials in proceedings adds an air of presupposed
criminality.

All of this is not to devalue the efforts of these teams. They can be
among the first to recognize and the most eager to serve those
struggling with mental illness. And they often partner with other
student-service organizations whose goals are not focused on averting
the next mass shooting. If a case is not deemed particularly risky,
threat-assessment teams may pass it over to these groups. For students,
who are often facing unfamiliar challenges, these services can be
extraordinarily helpful, even life-saving. Many referrals to
threat-assessment teams are prompted by threats of suicide, for example.

The politics at play here are sadly familiar in the United States. A 
highly publicized mass shooting is followed by calls for stricter gun 
control, followed by pressure from gun supporters to maintain the 
status quo or even loosen restrictions on firearms. Somewhere along 
the line, fingers are pointed to the role mental illness had in the attack 
and attention shifts to the dismal state of mental health care in the 
country. Accusations are made, as are promises, but little is done. 
Threat assessment may not be a solution to violence, but if it means 
that more people get the help they need, irrespective of whether it 
staves off the next attack, then, to some people at least, it is a success.


Conflict of interest 


How two world wars affected scientific 
research, and vice versa. 


This year marks the anniversary of two significant events from
the last century, perhaps the most significant of any century:
100 years since the outbreak of the First World War and 75 years
since the start of the Second World War. It is natural for specialist pub- 
lications to search out a ‘local’ angle on major news events, and Nature 
is no different. When it comes to modern warfare, however, the task 
is easier than with most events, for science is not a tangential topic in 
armed conflict. It lies, for both good and evil, at its heart. 

We live, said Martin Luther King, in an age of guided missiles and 
misguided men. Scientists can do little about the latter (although we 
must still try), whereas the former shows the contradictions of military 
research in all its shades of grey. If we are to kill people, then is it a good 
thing that we are able to target them more precisely? The death of one 
becomes more likely; the deaths of others less so. 

In times of war, such ethical tongue-twisters tend to give way to 
the pragmatism of national politics. In 1943, James Collip, one of 
the Toronto group of scientists that isolated insulin, observed that:
“Today, with total war upon the world, there can be no doubt that more
than ever before in history this war is a contest between the brains,
imagination, inventiveness and teamwork of the scientists and production
workers of one group of nations pitted against those of another
group.” Whereas the first three of those attributes were always common
in science, teamwork, as Collip pointed out, came less naturally.

There are two ways to address the topic of science and war. The first, 
and the most conventional route, is to assess the impact that research 
has on conflict. Science in the First World War marked a turning point 
in tactics; no longer was a speedy and resourceful attacker likely to win. 
With machine guns and barbed wire at the front line, and behind them 
railroads for resupply, a well-dug-in defender became the favourite. 
(The US Civil War had demonstrated this too, but European generals 
were slow learners.) Technology made warfare asymmetric, and it 
has remained that way — the dreadful stalemate of mutually assured 
destruction by nuclear weapons notwithstanding. 

The second route is to look at the reverse of the equation: how has 
conflict influenced research? What lessons are there for peacetime 
science in the panicked scramble of work that aims not to understand 
how the world works and to improve quality of life, but to ensure that 
it remains at all? 

Nature intends to address both topics in several articles this year. 
And we kick off this week with a good example of each. On page 156, 
Sharon Weinberger reviews two books that analyse the wartime role 
of physics and psychology. And on page 153, David Kaiser explores 
how practical ways of getting US physicists to work together during 
the Second World War had an enduring impact on the organization 
and funding of science. For one thing, Kaiser writes, it turned on a
‘fire hose’ of federal funds for research, a model that continues. The
teamwork continues too, and if the stakes for winning and losing are
lower now than when the original collaborations were forged, that can
only be a good thing.
WORLD VIEW


This was no Antarctic pleasure cruise

After his polar vessel became trapped in shifting sea ice, Chris Turney
defends the scientific basis of the expedition.


Sitting in the ship’s lounge of the Australian icebreaker Aurora
Australis, safe with friends and colleagues and heading back to
civilization, I can say it has been a remarkable journey.

For the past six weeks on board the Russian icebreaker
MV Akademik Shokalskiy, my colleague Chris Fogwill and I have led a
team of scientists, science communicators and volunteers on a voyage
from the New Zealand subantarctic islands to the East Antarctic Ice
Sheet. The aim was to study various aspects of this vast, remote region
to better understand its role in the Earth system, and communicate
these results directly to the public. Yet most people only became aware
of our work when we got stuck and had to be rescued.

That is the reality of polar science. It is difficult. Almost every
season, ships get caught in sea ice, teams lose communications and planes
are sometimes tragically lost. Signatories to the Antarctic Treaty
understand that one science programme supports another, so national and
non-government vessels routinely assist each other.

What went wrong for the Shokalskiy? Contrary to some reports, the ship
was not frozen in but was pinned by remobilized sea ice that had been
blown by fierce winds. Most importantly, the team is safe and we are
incredibly grateful to the international effort to help us.

Could this have been avoided? The satellite data leading up to our
arrival in Antarctica’s Commonwealth Bay indicated open clear water,
and the area seemed to have been that way for some time. As the
Shokalskiy attempted to leave, however, we found ourselves surrounded by
a mass breakout of multiyear ice. This was a major event, with the vessel
surrounded by blocks of sea ice more than three metres thick, apparently
arriving from the other side of the Mertz Glacier. Despite the best
efforts of our captain, we could not find a route out. It was deeply
frustrating. We had been caught just 2-4 nautical miles (3.7-7.4
kilometres) from the edge of the sea ice. And with pervasive
southeasterly winds battering our location, this distance increased
to 20 nautical miles within 48 hours.

The extreme nature of the conditions is shown by the fate of the
Chinese icebreaker that came to our rescue: as I write this, that vessel
is also now trapped, and is awaiting the arrival of the huge US ship
Polar Star to smash a route to open water.

Since news of our plight raced around the world, I have been surprised
by the level of criticism our scientific expedition has received. This was
no pleasure cruise. The science case took two years to develop, and was
approved by the New Zealand Department of Conservation, the Tasmanian
Parks and Wildlife Service and the Australian Antarctic Division.

the region we sailed into. A southward shift of westerly winds is
influencing the Antarctic Circumpolar Current, increasing transitory
upwelling of Circumpolar Deep Water onto the Antarctic continental shelf.
At the same time, extensive sea ice has formed in Commonwealth Bay after
a huge iceberg called B09B collided with and destroyed the tongue of the
Mertz Glacier in 2010. This is adjacent to the Mertz polynya, a stretch
of open water surrounded by ice and a major source of Antarctic bottom
water formation. We wanted to gather data on the effects of both these
events on circulation, ocean properties, biodiversity and stability of
the East Antarctic Ice Sheet.

Never before has a science expedition reached out live to so many 
people from such a remote location. Public engagement was always a 
core theme. Well before we ran into trouble, we posted daily online 
reports of our research and aspects of life on the vessel and in the 
field. In recent weeks, this extended to reassur- 
ing those at home about the well-being of all on 
board. When the number of television and radio 
interviews increased, so did our mentions of the 
science. This encouraged people to follow our 
work, as seen by the number of hits received on 
the expedition website. In the past six weeks, 
www.spiritofmawson.com received 60,000 visits,
driving traffic to our social media sites. 

Our findings include many firsts for the region: 
detailed marine and terrestrial ecological studies, 
glaciological reconstructions and high-resolu- 
tion palaeoclimate analysis of tree rings, peats 
and ocean cores from the subantarctic islands. 
Guided by real-time satellite information, the 
team undertook an experiment across the Antarc- 
tic Convergence — a natural boundary between 
cold Antarctic and warmer subantarctic waters. By combining surface 
drifters with Argo floats (for measuring salinity and temperature), we 
have gained a unique snapshot of this important frontier. 

Reaching Commonwealth Bay, we crossed some 65 kilometres of sea 
ice to deliver scientists and conservators to the historic base established 
by scientist and explorer Douglas Mawson a century ago. We surveyed 
an airstrip for future visits, serviced and collected data from the auto- 
matic weather station, and obtained valuable Global Positioning System 
data for monitoring land-mass uplift as ice sheets retreat. 

Our rescue has caused disruption, but fortunately we hear that the 
next voyage of the Aurora Australis is likely to leave Hobart as sched- 
uled. Science will continue in the south: a great relief. In the meantime, 
the value of our expedition must be judged by the quality of the research 
it always intended to produce, and the remarkable rekindling of public 
interest in science and exploration that has come with it.


Chris Turney is a scientist in the Climate Change Research Centre at 
the University of New South Wales, Australia. 
e-mail: c.turney@unsw.edu.au 
RESEARCH HIGHLIGHTS


In vivo cell switch 
for brain repair 


By reprogramming one type of 
brain cell into another in vivo, 
researchers have opened the 
door to new ways of repairing 
damaged brains. 

Gong Chen and his 
colleagues at Pennsylvania 
State University in University 
Park converted reactive glial 
cells — the cells that flood 
sites of brain injury — into 
neurons in the brains of mice. 
They injected a retrovirus 
carrying the gene encoding a 
protein called NeuroD1 into 
the cortex of normal mice and 
those engineered to model 
Alzheimer’s disease. The virus 
delivered the gene to two types 
of glial cell, resulting in the 
reprogramming of these cells 
into functional excitatory or 
inhibitory neurons. 

NeuroD1 also turned human 
astrocytes, a type of glial 
cell, into functional neurons 
in vitro. The authors suggest 
that the approach could be used 
to replace neurons lost to injury 
or disease in humans. 

Cell Stem Cell http://doi.org/qq7 (2013)


Extra-stretchy 
graphene gloves 


Graphene-based sensors that measure strain, or deformation, can be
stretched to twice their normal length. These could be useful for
the development of wearable interactive electronics.

Previous such devices struggled to stretch by even 30%, making them
too stiff to detect the full range of motion of human joints, for
instance. Pooi See Lee and her colleagues at Nanyang Technological
University in Singapore made their sensors out of nanopaper: crumpled
graphene (atom-thick sheets of carbon) and tiny cellulose fibrils
embedded in a stretchy material.

A glove developed by the researchers has strain sensors on each finger
(pictured) for measuring the bending and stretching of separate digits.
It could one day be used to perform surgery remotely and in other
applications.

Adv. Mater. http://doi.org/qrh (2013)


Comets hint at cosmic encounter


Researchers have discovered a second comet belt in Fomalhaut, a bright,
triple-star system that is already known to host an exoplanet and
a bright comet belt around its primary star, Fomalhaut A (pictured).

Grant Kennedy at the University of Cambridge, UK, and his colleagues
used data from the Herschel Space Observatory to find the second belt,
which surrounds the system's least-massive star, Fomalhaut C. The
discovery of two such bright comet belts around stars in the same system
is rare, and suggests that the two stars had a close encounter that
increased collision rates within each debris disk: colliding comets
generate large amounts of dust and ice that would make the comet belts
look brighter. Such a stellar interaction could also explain the
elliptical orbits of Fomalhaut A’s exoplanet and comet ring.

Mon. Not. R. Astron. Soc. 437, 2686-2701 (2014)


Dietary fibre
dampens asthma


A high-fibre diet curbs allergic inflammation in mouse lungs
by shifting the composition of microbes in the gut.

Benjamin Marsland at the University of Lausanne in Switzerland and his
colleagues raised mice on diets containing different levels of fibre and
exposed the animals to extracts of house-dust mite, a cause of asthma.
The resulting lung inflammation was less in mice consuming high levels
of fermentable fibre than in those on a low-fibre diet, and the animals
also harboured a community of intestinal microbes that generated higher
levels of short-chain fatty acids when metabolizing fibre. These
fatty-acid molecules boosted the generation of immune cells called
dendritic cells that were less able to trigger allergic inflammation in
the lungs.

The results provide a possible link between the rising incidence of
asthma in developed countries and decreasing dietary-fibre intake.

Nature Med. http://dx.doi.org/10.1038/nm.3444 (2014)


ZOOLOGY 


Tobacco breath 
aids defence 


The tobacco hornworm, which 
feeds on tobacco plants, exhales 
some of the ingested nicotine to 
repel predators. 

Ian Baldwin and his
colleagues at the Max Planck 
Institute for Chemical Ecology 
in Jena, Germany, glued tiny 
sensors to the mouths of 
tobacco hornworm larvae 
(Manduca sexta; pictured) 
to measure the levels of 
nicotine in their breath. They 
found that larvae that fed on 
engineered, nicotine-free 
tobacco plants (Nicotiana 
attenuata) exhaled less 
nicotine than larvae fed 
on normal plants, as did 
larvae in which an enzyme 
that transfers nicotine from 
the gut into the circulation 
was silenced. Wolf spiders 
(Camptocosa parallela) 
preferred to prey on larvae 
with less nicotine on their 
breath. 

The study shows how 
ingested toxic chemicals can 
be used for predator defence. 
Proc. Natl Acad. Sci. USA http://doi.org/qq3 (2013)


CRISPR screen 
identifies genes 


Two teams show how a
genome-editing system can be 
used to screen human cells for 
genes of interest. 

The CRISPR system allows 
biologists to edit specific genes 
using ‘guide’ RNA molecules 
that target them. Feng Zhang 
at the Broad Institute in 
Cambridge, Massachusetts, 
and his colleagues created a 
library of 64,751 guide RNA 
sequences that target 18,080 
genes in human cells. Using 
this library, the researchers 
pinpointed genes that are 
required by cancer and stem 
cells to survive. They also 
teased out genes that, when 
lost, allow cancer cells to 
fend off the melanoma drug 
vemurafenib. 

A separate team led by 
Eric Lander at the Broad 
Institute and David Sabatini 
at the Whitehead Institute 
for Biomedical Research in 
Cambridge, Massachusetts, 
used a library of 73,000 
guide RNAs to screen for 
several genes, including 
those involved in resistance 
to the chemotherapy drug 
etoposide. 

Science 343, 80-84; 84-87 
(2014) 


Better pictures of 
protein structures 


A modified method for determining the three-dimensional structure
of large proteins seems to show them in a more natural pose than
conventional techniques do.

Proteins called G protein-coupled receptors (GPCRs) are important drug
targets, but researchers struggle to figure out their structures. Vadim
Cherezov of the Scripps Research Institute in La Jolla, California, and
his colleagues modified the standard X-ray crystallography technique by
using an X-ray free-electron laser to capture serial images of the
structure of a GPCR for the neurotransmitter serotonin.

The team used the technique on tiny serotonin-receptor crystals kept at
room temperature, and obtained structures that differed from those
determined using conventional approaches with larger crystals kept
at cool temperatures. The results suggest that the room-temperature
free-electron-laser approach may better capture the protein's
conformation in its native environment.

Science 342, 1521-1524 (2013)


Chemotherapy kills cancer cells, but in colorectal cancer it can also
stimulate their growth by activating cells called fibroblasts in the
connective tissue.

Matthew Kalady and Jeremy Rich at the Cleveland Clinic
in Ohio and their colleagues analysed tumours from patients
with colorectal cancer before and after chemotherapy. The
researchers found that the abundance of cancer-associated
fibroblasts increased after treatment, and that these cells
enhanced the ability of a subset of cancer cells to initiate
tumour growth. The fibroblasts seem to do this by secreting
signalling proteins, including one called IL-17A.

The findings suggest that chemotherapy can trigger drug
resistance by changing the tumour’s microenvironment.
Disrupting this mechanism could be a way of improving
cancer therapies, the authors say.

J. Exp. Med. 210, 2851-2872 (2013)


Radar signals 
sinkhole to come 


Radar measurements taken more than a month before a giant sinkhole
(pictured) opened up in 2012 in Bayou Corne, Louisiana, reveal
that nearby ground shifted horizontally towards the pit’s location.

Cathleen Jones and Ronald Blom of NASA’s Jet Propulsion Laboratory in
Pasadena, California, looked at radar data gathered by an unmanned
aircraft as part of a Mississippi River delta study. By comparing data
from flight passes in June 2011 and July 2012, the team saw that
surface material had moved by as much as 26 centimetres towards where
the 110-metre-wide sinkhole appeared in August 2012.

Radar remote sensing could be a way of predicting the formation of
these potentially catastrophic sinkholes and their growth rate, the
authors say.

Geology http://doi.org/qnr 
(2013) 
SEVEN DAYS
The news in brief


Seeds of change

The US Department of Agriculture (USDA) on 3 January proposed removing
restrictions on the use of maize (corn) and soya bean seeds
that are genetically engineered to tolerate 2,4-D, a weedkiller
that is commonly used on other crops. More widespread use of the
genetically modified seeds, which are made by Dow AgroSciences in
Indianapolis, Indiana, along with 2,4-D could drive evolutionary
selection for weeds that are resistant to the chemical, the
agency cautioned. But the USDA noted that the move would provide a
much-needed tool for farmers to manage fields that are already
plagued by weeds resistant to another weedkiller, glyphosate
(Roundup).


Clinical data

A lack of access to clinical-trial data is hindering research and
medical care, according to a British government report
released on 3 January. In the report, which also examined
the United Kingdom's stockpiling programme for the influenza drug
Tamiflu (oseltamivir), the authors say that evaluation of the
efficacy of Tamiflu and other medicines has been hampered
by drug manufacturers withholding data. The report
follows recent European and US initiatives to increase data
sharing and transparency in clinical trials. See page 131
and go.nature.com/91gbd6 for more.


Gun controls

The US Department of Health and Human Services
proposed on 3 January that patient-privacy exemptions
should be created so that relevant mental-health
records can be submitted to the national databases used to
screen potential gun buyers. So far, background checks
have prevented the sale of more than 2 million firearms,
according to the White House. But some researchers who
study firearms violence have called for better safeguards
against gun ownership by those who are mentally ill (see
Nature 496, 412-415; 2013).


Cannabis commerce

The world’s first legal market for recreational marijuana
opened in Colorado on 1 January. Colorado is one
of only two US states to have approved non-medicinal
use of marijuana by adults (see go.nature.com/rtr30u).
Although the drug remains illegal under national laws,
the US government has said that it will not interfere with
state industries kept under strict local controls. Last
month, Uruguay became the first country to approve
a national market for legal marijuana, which has yet to be
implemented.


China joins ivory-crushing campaign

China's government crushed more than six
tonnes of seized ivory in Dongguan on 6 January,
as part of a global effort to crack down on illegal
trading in smuggled tusks and carvings. The
move was China’s first public destruction of
ivory, showing the country’s intention to thwart
a worrying rise in elephant poaching (see Nature
503, 452; 2013). The United States crushed
a similar weight of ivory last November. See
go.nature.com/ib2fpa for more.


The increase in female speakers at the American
Society for Microbiology general meeting in sessions
organized by teams that included at least one woman,
compared with those set up by all-male teams, according
to a study of conferences from 2011 to 2013.

Source: Casadevall, A. & Handelsman, J. mBio
http://dx.doi.org/10.1128/mBio.00846-13 (2014).


New drug approvals 
The US Food and Drug 
Administration approved 

27 new drugs in 2013, down 
from a 15-year high of 39 drugs 
the year before, which some 
analysts had taken as a signal 
of revived fortunes in the 
pharmaceutical industry. 

The agency told reporters 

that it had received fewer 

drug applications for review 

in 2013 and that the number 

of approvals was in line with 
the average of 28 new drugs 
approved annually over the 
past five years. 


Asteroid ahoy! 

A small asteroid disintegrated 
above the Atlantic Ocean on 

2 January, becoming only the 
second space rock to be spotted 
hours before it hit Earth. The 
Catalina Sky Survey near
Tucson, Arizona, discovered 
asteroid 2014 AA in the small 
hours of 1 January. The rock, 
some 2 to 3 metres across, 
burned up on hitting Earth's 
atmosphere. In October 2008, 
researchers tracked asteroid 
2008 TC3 all the way from
space to the desert wastes 

of northern Sudan, where 
fragments of it were recovered 
as meteorites (see Nature 458, 
401-403; 2009). 


Cold comfort 


Scientists, journalists and 
tourists were rescued from 
the Russian ship Akademik 
Shokalskiy in the Antarctic 

on 2 January. The vessel had 
been on a research voyage 
when it became trapped by 
ice near Commonwealth 

Bay on 24 December. 

Chinese icebreaker Xue Long 
transferred the stranded 
passengers to an Australian 
icebreaker, but later reported 
that it, too, had become 

stuck. A US icebreaker was 
dispatched on 5 January to 
assist the Russian and Chinese 
vessels. See page 133 for more. 


Fossil felony 


A fossil retailer from Eagle, 
Colorado, pleaded guilty 

on 2 January to conspiracy 
to smuggle dinosaur bones 
and other fossils into the 
United States from China 
and Mongolia. John Richard 


TREND WATCH 


Recent years have seen a declining 
share of the US economy spent 
on biomedical research and 
development (R&D), while 
several Asian nations are boasting 
growing investments (see chart), 
according to a study published 

on 2 January (J. Chakma et al. 

N. Engl. J. Med. 370, 3-6; 2014). 
The shifting trends mark changes 
in spending by the biomedical 
industry, perhaps reflecting 

lower labour costs and more 
government subsidies for 
commercial R&D in Asia, the 
authors suggest. 


Rolater agreed to surrender 
any claims to the illegally 
obtained goods, which 
include a fossilized skull of a 
juvenile Tyrannosaurus bataar 
(pictured) that is estimated to 
be worth US$1,875,000. Other 
items include a sabre-toothed 
cat skull and dinosaur eggs. 
Rolater has also agreed to pay 
a $25,000 fine and submit 

to two years of supervised 
probation. 


Turing pardon 


British mathematician 

Alan Turing has received a 
posthumous royal pardon. In 
1952, Turing was convicted of 
‘gross indecency’ under anti- 
homosexuality legislation, and 
later took his own life. Turing’s 
work in the Second World War 
helped to break the German 
Enigma cipher, and his 
concept of a universal ‘Turing
machine’, a programmable
system that stores and 
processes information, is 
considered a cornerstone of 
computer science (see Nature 
482, 441; 2012). 


ASIAN R&D BOOM 


Falsified research 
The US Office of Research 
Integrity has sanctioned two 
biomedical researchers in 
seven days. On 30 December, 
the agency reported that 
Baoyan Xu, a former 
postdoctoral fellow at the 

US National Institutes of 
Health (NIH) in Bethesda, 
Maryland, had published 
falsified data on the immune 
responses of patients with 
hepatitis to a newly discovered 
virus. A week earlier, Dong- 
Pyou Han, a former research 
assistant professor at Iowa
State University of Science 

and Technology in Ames, 

was found to have falsified 
results when researching 

a vaccine against human 
immunodeficiency virus 1 
(HIV-1) by spiking rabbit 
blood samples with antibodies. 
The false results had been 
reported widely at national and 
international meetings, and in 
NIH grant applications. 


Open access 

An international open-access 
effort kicks off this month 

to make all particle-physics 
research articles freely 
available to readers. The 
Sponsoring Consortium for 
Open Access Publishing in 
Particle Physics (SCOAP³) is
led by CERN, Europe's high- 
energy physics laboratory 
near Geneva, Switzerland. 


Countries including China, India and South Korea boost 
biomedical spending as US dominance slips.

SEVEN DAYS | THIS WEEK


10-15 JANUARY 
Understanding how 
cells’ nuclear receptors 
regulate gene expression 
is the focus of a
Keystone Symposium 
on Molecular and 
Cellular Biology in 
Taos, New Mexico. Hot 
topics include the roles 
of nuclear receptors 

in wound healing and 
cancer progression. 
go.nature.com/rs9oyb 


15-17 JANUARY 
The 8th Human 
Amyloid Imaging 
meeting in Miami, 
Florida, will discuss 
the latest research 

on measuring and 
interpreting changes in 
amyloid protein in the 
brain, as well as other 
biomarkers linked to 
Alzheimer’s disease. 
go.nature.com/oi5wkp 


The project has already 
experienced a few hiccups, 
with some major journals 
and universities opting not to 
participate. See page 141 

for more. 


FUNDING
Cancer donation 


Six US research centres have 
received a combined donation 
of US$540 million from 

the estate of late shipping 
magnate Daniel Ludwig. The 
gift will boost funding for 

the Ludwig Centers at Johns 
Hopkins University, Harvard 
University, the Massachusetts 
Institute of Technology, 
Memorial Sloan-Kettering 
Cancer Center, Stanford 
University and the University 
of Chicago. Announced on 

6 January, the donation brings 
the total contribution to 
cancer research by Ludwig and 
his estate to $2.5 billion. 


> NATURE.COM 
For daily news updates see:
www.nature.com/news

NEWS IN FOCUS


PUBLISHING Tensions as open-access scheme
for particle physics goes live p.141


EARTH SCIENCE Mini-satellite 
swarms coming soon to 
heavens near you p.143 


FUNDING For first time 


ever, China tops Europe 
in R&D intensity p.144 


COMPUTERS Artificial-intelligence
goals edge
closer to reality p.146 


Potential patients have offered vocal support for Stamina’s stem-cell treatment in Italy. 


REGENERATIVE MEDICINE 


Leaked files slam 
stem-cell therapy 


Disclosures and resignations reveal scientific concerns over 
methods of Italy’s Stamina Foundation. 


BY ALISON ABBOTT 


A series of damning documents seen by
Nature expose deep concerns over the
safety and efficacy of the controversial 
stem-cell therapy promoted by Italy’s Stamina 
Foundation. The leaked papers reveal the true 


nature of the processes involved, long withheld 
by Stamina’s president, Davide Vannoni. Other 


disclosures show that the successes claimed 
by Stamina for its treatments have been over- 
stated. And, in an unexpected twist, top Ital- 
ian scientists are dissociating themselves from 
an influential Miami-based clinician over his 
apparent support for the foundation. 
Stamina, based in Brescia, claims that it suc- 
cessfully treated more than 80 patients, mostly 
children, for a wide range of conditions, from 


Parkinson's disease to muscular dystrophy, 
before the health authorities halted its opera- 
tions in August 2012. A clinical trial to assess 
the treatment formally was approved by the 
Italian government last May, and an expert 
committee was convened by the health ministry
to study Stamina’s method and to recommend
which illnesses the trial should target.

Stamina says that its technique involves 
extracting mesenchymal stem cells from a 
patient’s bone marrow, culturing them so that 
they turn into nerve cells, and then inject- 
ing them back into the same patient. But full 
details of the method have never been revealed, 
and Vannoni provided the full protocol to the 
expert committee only in August. 

In October, the committee’s report 
prompted health minister Beatrice Lorenzin 
to halt plans for the clinical trial. That led to 
public protests in support of Stamina, and, 
after an appeal by Vannoni, a court ruled in 
early December that the expert committee 
was unlawfully biased. Some members had 
previously expressed negative opinions of the 
method, the ruling said. As a result, Lorenzin 
appointed a new committee on 28 December, 
reopening the possibility of a clinical trial. 

Stamina’s protocol, together with the origi- 
nal committee's report, was leaked to the press 
on 20 December (Nature has also been shown 
transcripts of the committee’s deliberations). 
The leaked papers reveal that the original 
expert committee identified serious flaws 
and omissions in Stamina’s clinical protocol.
It did not apply legally required Good Manu- 
facturing Practice standards, the committee 
says. The protocol exposed an apparent igno- 
rance of stem-cell biology and relevant clinical 
expertise, the report argues, as well as flawed 
methods and therapeutic rationale (see ‘Protocol
opinion’).

A week after the leaks, the health minis- 
try revealed that the condition of 36 patients 
treated with Stamina’s stem-cell therapy had 
not improved, contrary to Vannoni’s claims 
that more patients had shown improvement. 

The leaked documents also convey the 
original committee members’ disquiet over the 
unusually strict confidentiality agreement that 
they had to sign. This prevented them from 
ever divulging details of the protocol — and 
each committee member received an indi- 
vidualized copy of the method from Stamina 
to aid in the identification of any leak. The 
committee argued that such secrecy was

PROTOCOL OPINION


What the expert committee said on Stamina’s methods. 


The report of the original expert committee 
tasked with looking at Stamina’s clinical 
protocol includes the following opinions: 

The protocol contains no method for 
screening for pathogens such as prions or 
viruses, even though the culture medium 
used could contain them. 

The method it describes for producing 
mesenchymal stem cells, say the experts, 
would generate a mixture of different 
cell types that could include blood-cell 
precursors and bone fragments. 

The method it describes for checking 
the biological identity of the cells uses 
inappropriate cell-surface markers and no 
functional assay. And the protocol does not 
include a method for making mesenchymal 
stem cells differentiate into neural cells, the 
rationale Stamina provided, along with the 
protocol, in support of the clinical value of 


> unnecessary because no intellectual prop- 
erty or commercial interests were involved. 
Vannoni insists that Stamina will not make a 
profit. He has also said that, in the opinion of 
the court, the committee “had neither the right 
nor competence” to comment on the protocol. 

Other events have further dented Stamina’s 
credibility. Leading researchers on the scientific 
advisory boards of two independent stem-cell 
initiatives headed by clinician Camillo Ricordi 
have resigned in protest over the apparent 
public support offered to Stamina by Ricordi. 
Ricordi, who works on diabetes at the Univer- 
sity of Miami in Florida, has in the past called 
Stamina’s method “safe” and “promising”. 

On 23 December, Carlo Croce, a cancer 
researcher at the Ohio State University in 
Columbus, resigned from the scientific committee
of one of the initiatives, the Ri.MED
Foundation, a publicly funded regenerative- 
medicine institute being built in Palermo, Italy. 
Croce has called for Ricordi to be removed as 
Ri.MED’s president. Other committee mem- 
bers told Nature that they are also considering 
resigning from the Ri.MED committee.

And at the end of December, cell biologist 
Carlo Redi of the University of Pavia, Italy, 


the method in three disorders it proposed 
that the clinical trials address. 

Even if the Stamina method did generate 
the desired cells, the experts’ report notes, 
there would be too few of them for the 
treatments Stamina proposes. The experts 
also criticized an ‘emergency’ measure in 
the protocol that would culture a patient’s 
sample once more just before injection if 
at the last minute numbers of stem cells 
appeared sparse. This would mean that 
treatments across patients in the clinical 
trial would not be standardized, they say. 

The clinical rationales provided by 
Stamina also contain conceptual errors, the 
experts say, as it is broadly accepted by the
scientific community that the stem cells can 
differentiate into only bone, fat or cartilage. 
Moreover, the committee notes that sections 
of the protocol are copied from Wikipedia. A.A. 


stem-cell biologist Giulio Cossu at University 
College London and Francesca Pasinelli, direc- 
tor-general of the Italian grant-giving charity 
Telethon, all resigned from the Cure Alliance, 
a lobby group for speeding up translational 
medicine that Ricordi launched. 

The scientists who resigned say that they 
were dismayed by Ricordi’s insistence that the 
value of Stamina’s therapy had not yet been 
proved or disproved, as well as his offer to test 
and possibly improve it in his Miami facilities. 

Ricordi has told Nature that he is “not at all 
in favour of Stamina, but in favour of a verification
process that is in the interest of all”. He also
points out that new members have joined the 
committees of the two initiatives. In response 
to the resignations, Vannoni told Nature: “I 
think that if a scientist resigns from a research 
centre because a colleague of his decides to 
study a new method, he/she has not got a cor- 
rect approach to science.” 

Ricordi has featured regularly in the long 
battle over Stamina’s methods. For exam- 
ple, at the July 2013 meeting of the Congress 
of the Cell Transplant Society in Milan, he 
announced that he had seen the then-secret 
Stamina protocol and witnessed how patients 


in Brescia had improved. In a press statement 
that day, he said that he believed it to be a “safe” 
procedure. “The results and data I was able to 
see appeared promising,” he said.

On 7 January, he added that: “If someone has 
a protocol that results in clinical benefits — to 
be validated — I wouldn't question their level 
of stem-cell biology knowledge.” He also said 
he had seen certificates confirming that Stamina’s
cell samples were sterile. But Paolo Bianco,
an expert in mesenchymal stem cells from the 
University of Rome La Sapienza, notes that 
these certificates do not cover the presence of 
prions or viruses in the sample. 

Ricordi backs a controversial proposal that 
cell therapies should not be regulated as medi- 
cines, as US and European regulators insist, but 
as transplants. He argues that because trans- 
plants are not subject to strict regulation, novel 
stem-cell therapies could be introduced more 
quickly. 

As editor-in-chief, Ricordi launched the 
official journal of the Cure Alliance, CellR4,
in 2013. He has written editorials in the jour- 
nal attacking critics of his views on transplant 
regulation, including Bianco and Nature. 

Ricordi began lobbying the former Italian 
health minister Renato Balduzzi over his views 
on transplants at the end of March last year. On 
16 April, Balduzzi nominated him as president 
of Ri.MED. And in an e-mail to Nature on 2 Jan- 
uary this year, Ricordi said that Lorenzin — the 
current health minister — “has recently asked 
me to help in resetting regulation for cellular 
therapies”. But Lorenzin told Nature that she
had met Ricordi only once at a social occasion 
and that they had discussed a different topic. 

In December, Ricordi reached an agreement 
with Stamina to begin testing its patients’ cell 
samples in his facilities in Miami. He denies 
being in favour of Stamina, but says that he 
is an advocate of patients who need to know 
whether or not the treatment works. “Testing 
is the only way to bring clarity,” he says.

But Ruggero De Maria, science director of 
the Regina Elena National Tumour Institute 
in Rome, says: “Tests on samples have already 
been carried out independently at the Univer- 
sity of Modena in Italy. I feel offended when I see 
Ricordi praising Stamina and attacking experts.” 

Ricordi told Nature that “a mud machine is 
being orchestrated” against him. Vannoni says 
that Ricordi is objective and open to new ideas 
but is not “supportive” of him.

MAXIMILIAN BRICE/CERN


Particle-physics research is at the centre of a global push for open-access publishing. 


Particle-physics
papers set free 


Tensions as open-access initiative goes live — without the 


field’s leading journal. 
BY RICHARD VAN NOORDEN 


January sees the start of what has been billed
as the largest-scale open-access initiative
ever built: an international effort to switch
the entire field of particle physics to
open-access publishing.

But the initiative, organized by CERN,
Europe's high-energy physics laboratory near
Geneva in Switzerland, has not yet fulfilled its
dream: it currently covers only a little more
than half of published particle-physics papers.

The scheme’s scope was slashed in the summer
when the field’s largest journal, Physical
Review D, pulled out, although its publisher,
the American Physical Society (APS), did
agree to publish papers on experiments at
CERN’s Large Hadron Collider on an
open-access basis without charging author fees.

And as the starting gun sounded, a number
of US libraries, including those of Stanford
University in California and Yale University in
New Haven, Connecticut, declined to pay into
the system. Robert Schwarzwalder, associate
university librarian for Stanford’s science and
engineering libraries, wants to see open-access
research but is not sure that CERN’s initiative is
needed, given that versions of almost all high- 
energy physics articles already appear online 
for free on the preprint server arXiv and the 
repository INSPIRE-HEP. Yale did not return 
Nature’s calls. 

In the global scheme, called the Sponsoring
Consortium for Open Access Publishing
in Particle Physics (SCOAP³), libraries
either pay reduced subscription fees for
participating journals, or stop paying
them altogether. The cash saved goes into a
central SCOAP³ fund, used to pay publishers
up front to publish open-access articles.

“Being able to flip four out of the five large
journals to open-access publishing is a
remarkable result.”


Instead of hiding articles behind paywalls, 
publishers will make them immediately avail- 
able on their own websites, with generous 
rights for reuse. Authors will retain the copy- 
right. Libraries will not necessarily save money, 
because the average fee for publishing a paper,
€1,150 (US$1,570), has been set roughly
to match publishers’ lost subscription revenues.
But in three years’ time, contracts may be rene- 
gotiated, so open-access fees might go down. 

SCOAP³ was designed to flip journals to
open access without disrupting researchers’
funding arrangements: scientists would
not have to pay fees from their own research 
grants. In 2012, after more than 6 years of 
discussion with over 1,000 libraries, library 
consortia and research organizations in 
25 countries, 12 journals signed up to the 
scheme. Six agreed to switch entirely to open- 
access models and six to publish particle- 
physics articles in an open-access format. It 
was “the most systematic attempt to convert 
all the journals in a given field to open access”,
says Peter Suber, director of the Harvard Open 
Access Project in Cambridge, Massachusetts. 

Then, in June 2013, the APS announced that 
it would pull out of the scheme, taking with 
it two journals, including Physical Review D, 
which publishes more than 3,000 high-energy 
physics articles each year behind a paywall, 
although it also allows authors to pay to make 
individual articles open access. The APS says 
that it is committed to “sustainable open access”,
and notes that it allows authors to post the final 
manuscripts of APS papers on their own web- 
sites. But the society wanted to maintain its 
“long-term financial stability”, it said. “The data
available to us did not allay our long-standing
concerns about the stability of SCOAP³ and
about the risks to Physical Review D from partic- 
ipating,” says Joseph Serene, treasurer and pub- 
lisher for the APS in College Park, Maryland. 

Salvatore Mele, who leads SCOAP³ from
CERN, says he regrets the APS decision, which 
has sparked a minor stand-off. CERN has 
decreed that all articles based on its research 
must be open access, but says it will not pay 
open-access fees to journals that either withdrew
from or remained outside the SCOAP³
process. The APS says that — apart from papers 
relating to the Large Hadron Collider — it will 
not waive open-access fees for CERN, so it will 
be up to individual authors to find the money. 

Schwarzwalder is also concerned about the 
long-term viability of the scheme. The danger, 
he says, is that libraries may be tempted to 
renege on their pledges in order to save money 
— because they will have access to the papers 
whether or not they pay. Libraries that do not 
pay into SCOAP³’s €5-million pot will also
effectively be freeloading on those that have.
This could rapidly make SCOAP³’s publishing
economically unsustainable. Nevertheless, 
Schwarzwalder says Stanford is now recon- 
sidering its decision. 

Despite all the teething pains, Mele considers 
SCOAP³ a success. It has created a worldwide
community of funding agencies and libraries 
that believe it is possible to convert subscrip- 
tion fees into open-access fees. “Being able to 
flip four out of the five large journals to
open-access publishing, which was considered to
be impossible, is a remarkable result.”

This woman’s rash is symptomatic of kala-azar, a parasitic disease spread by sandfly bites in the tropics.


PHARMACEUTICALS 


Projects set to tackle 
neglected diseases 


But they do little to alter the process of drug development. 


BY ERIKA CHECK HAYDEN 


Kala-azar, the most deadly parasitic
disease after malaria, afflicts hundreds
of thousands of the world’s poorest 
people in tropical countries such as India, 
Brazil and Sudan. Spread by sandfly bites, the 
disease can be fought with existing treatments 
— but these are expensive and inconvenient, 
and sometimes have toxic side effects. 

Yet commercial work aimed at finding better 
drugs for kala-azar has largely been abandoned. 
Pharmaceutical companies say that poor cus- 
tomers cannot afford to pay the high prices 
needed to recoup development costs. Critics 
say that eight proposals, endorsed last month 
by reviewers for the World Health Organiza- 
tion (WHO) to break the stalemate for this 
and other neglected diseases, are noble, but no 
solution. The measures will do little, they say, to 
solve a broader problem: the disparity in spend- 
ing on research and development for diseases 
of the rich and those of the poor. 

The proposal to combat kala-azar (also 
known as visceral leishmaniasis, or VL) would
combine groups already working on drugs for 
the disease into a single organization, the VL 
Global R&D & Access Initiative. This would 
seek to develop durable oral drugs that do not 
require cold storage or intravenous delivery. 
The non-profit plan will be considered by 


the WHO executive board at a meeting on 
20-25 January at the organization's headquar- 
ters in Geneva, Switzerland. 

But critics are upset that novel and more 
risky ideas that would have helped to unlink 
the cost of drug development from prices were 
eschewed in favour of the eight shortlisted pro- 
posals, which were seen as more viable because 
they build on existing efforts and focus on 
specific diseases. “The proposals that were 
brought forward were not as strong as we had 
hoped in identifying alternative pathways to 
traditional research and development through 
commercial channels,” says Nils Daulaire, 
assistant secretary for global affairs at the US 
Department of Health and Human Services. 
“That was, frankly, disappointing.”

In response to this criticism, the WHO has 
asked the backers of the eight projects — five of 
which focus on developing vaccines or medi- 
cines for specific neglected diseases, one on 
fever diagnostics and two on basic research 
— to explain this month how they will test 
methods for funding the work. The responses 
will help the executive board to decide which 
projects to endorse. Then, at a World Health 
Assembly meeting in Geneva in May, countries 
will be asked to commit funds for the schemes. 

The projects are part of an attempt to salvage 
a decade-long effort to create new funding 
mechanisms for neglected diseases. Despite
campaigning from advocacy organizations
such as the Drugs for Neglected Diseases Initiative
in Geneva, and hefty donations from
groups such as the Bill & Melinda Gates Foundation
in Seattle, Washington, drug development
is still disproportionately focused on


diseases of the rich, such as heart disease and 
cancer. 

Three times in the past decade, countries 
have failed to sign treaties that would commit 


them to fund drug development for neglected


diseases. When the latest attempt was quashed 
in November 2012, diplomats agreed instead 
to back a series of demonstration projects that 
would test new funding mechanisms and be 
reviewed in 2016. 

But critics worry that the eight shortlisted 
pilot projects are not actually testing new ways 
of funding, and that more innovative ones 
have been shelved. One proposal, rejected last 
month, would have used two tools — mile- 
stone payments and patent pools — to spur 
the development of tuberculosis medicines. 
Milestone payments would reward early-stage 
successes of potential drugs, such as proof of 
activity in humans. Recipients of the payments 
would then place intellectual property on these 
potential drugs into a patent pool. Drug devel- 
opers could license these patents at low cost 
and would agree to put further patents back in 
the pool. Another rejected proposal involved 
taxing antibiotic use to fund the development 
of antimicrobials. 

In their deliberations, reviewers were asked 
to score the projects’ public-health impact and 
scientific merit ahead of their novelty. Some 
neglected-disease advocates say that those 
priorities should have been reversed. Now 
that the more innovative projects have been 
dropped, “we're not going to get to the place 
in two years’ time where we can say how well a 
completely different approach to research and 
development can work”, says Katy Athersuch,
an advocate for affordable medicines with the 
non-profit organization Médecins Sans Frontières
(also known as Doctors Without Borders),
based in Geneva.

Paying for the projects in the traditional 
way — by garnering direct support from donor 
nations — may be difficult enough. A 2012 
WHO report recommended that all countries 
spend 0.01% of their annual gross domestic 
product on neglected diseases, which would 
roughly double spending on these illnesses to 
US$6 billion per year. Only the United States 
is currently doing this, and emerging econo- 
mies such as China, Brazil and India have yet 
to increase their spending. 

If the WHO endorses some of the eight 
projects later this month, it will be a critical 
time to see if nations step up to pay for them, 
says John-Arne Roettingen, a global-health 
researcher at Harvard School of Public Health 
in Boston, Massachusetts. He says: “This will 
be the first test of whether countries are willing 
to put their money on the table.”

Many eyes on Earth


Swarms of small satellites set to deliver close to real-time imagery of swathes of the planet. 


BY DECLAN BUTLER 


Imagine using Google Earth or other online
mapping tools to zoom in on high-resolution
satellite images of the planet taken just
hours or days ago. Navigating backwards and
forwards in time, one could track changes in
everything from crops, forests and wildlife
movement to urban sprawl and natural disasters,
all with unrivalled temporal precision.

This is the vision of two Californian start-up
companies that are set to launch swarms
of small imaging satellites, which, by virtue of
their sheer numbers, will be able to revisit and
photograph huge swathes of the planet as often
as several times each day, a frequency much
higher than that achieved by current
Earth-observing satellites.

San Francisco-based Planet Labs, founded
in 2010 by three former NASA scientists, is
scheduled to launch 28 of its ‘Doves’ on 8 January.
Each toaster-sized device weighs about
5 kilograms and can take images at a resolution
of 3-5 metres.

At Skybox Imaging in nearby Palo Alto,
plans are afoot for a swarm of 24 satellites, each
weighing about 100 kilograms, which will take
images of 1 metre resolution or better. Skybox
launched its first satellite on 21 November and
plans to launch another this year, followed by
the remainder between 2015 and 2017.

In a first, at least for civilian satellites,
Skybox’s devices will also stream short
segments of near-live high-resolution video
footage of the planet. So, too, will UrtheCast,


THE SWARM COMETH 


Skybox Imaging plans to launch 24 high-resolution SkySats in the next few years. 


a start-up based in Vancouver, Canada, whose 
cameras will hitch a ride on the International 
Space Station (see go.nature.com/cebdkb). 
The efforts could herald a sea change for 
imaging. Conventional imaging satellites, 
which are the size of a van and weigh tonnes, 
cost hundreds of millions of dollars to build 
and launch (see ‘The swarm cometh’). As a
result, there are only a handful of operators, 
and the commercial world fleet comprises 


Small, light and cheap satellites could transform Earth observation.
How they measure up to their larger brethren:

Operator: Planet Labs | Skybox Imaging | NASA | DigitalGlobe
Number of satellites*: 32 | 24 | N/A | N/A
Weight: ~5 kg | ~100 kg | 2,071 kg† | 2,800 kg
Instruments: Optical and near-infrared spectral bands | Optical and near-infrared spectral bands | Multiple spectral bands | Multiple spectral bands
Spatial resolution: 3-5 m | 1 m or better | 15-100 m‡ | 0.3-30 m‡

*When fully operational. †Without instruments. ‡Depending on spectral frequency.


fewer than 20 satellites. Commercial satellites 
also tend to take pictures mainly when their 
operators receive orders from customers. 

By contrast, the swarm satellites’ cameras 
will always be on, photographing everything 
in their path and, owing to their numbers, 
will pass over the same points on Earth with a 
frequency of hours to a few days, depending 
on latitude. 

The biggest customers of conventional com- 
mercial imaging satellites are governments, 
in particular intelligence agencies and the 
military. Prices can be prohibitive for many 
other potential users, including researchers, 
in areas as diverse as farming, forest carbon 
management, regional and local planning, 
and environmental stewardship. By making 
their images cheaper, the new entrants into 
the marketplace hope to spur a proliferation of 
innovative uses. They also hope to offer heavy 
discounts or even make imagery free to aca- 
demics and non-governmental organizations. 

“This sector has for so long been driven 
by government requirements and, to a lesser 
extent, big industry players, that the mass- 
market consumer — the long tail — has been 
almost completely neglected,” says Scott Lar- 
son, chief executive of UrtheCast. Cheaper 
imagery, he says, will lead to “the democratization
of near-real-time Earth-observation data”.

To slash costs, Planet Labs and Skybox Imaging
use off-the-shelf technologies from the
automotive, smartphone and other consumer
industries, including low-cost electronics,
and sensors from high-end digital
cameras. Using the latest technologies from 
these fast-paced industries also allows the 
rapid, continuous development of better 
and better satellites, says Will Marshall, chief 
executive of Planet Labs. And miniaturizing 
satellites reduces launch costs. 

Because the swarms are still to be 
launched, scientists have yet to fully assess 
the quality of the imagery. But the satellites’ 
spatial resolutions of 1-5 metres are much 
higher than those of most scientific satel- 
lites. Landsat, NASA’s Earth-observation 
workhorse, for example, has a resolution of 
15-100 metres depending on the spectral 
frequency, with 30 metres in the visible- 
light range. 

Such medium-resolution imagery is 
adequate for many purposes, but higher 
resolution can have benefits, says Dan 
Berkenstock, co-founder and chief prod- 
uct officer of Skybox Imaging. He points to 
a study published in November that found 
that the use of moderate-resolution Landsat 
imagery greatly underestimated forest loss 
in the Democratic Republic of the Congo 
(A. Tyukavina et al. Environ. Res. Lett. 8, 
044039; 2013). 

Precision agriculture, a method that uses 
remote sensing to aid farm management, 
will also benefit from swarms, says Berken- 
stock, because the technology will be able 
to provide timely crop-yield and health esti- 
mates down to the level of rows of plants. 
Such detail could inform decisions on ferti- 
lizer and irrigation use, but is currently out 
of reach of most farmers. 

However, spatial resolution is only part of 
the picture, says Mike Wulder, a researcher 
at the Canadian Forest Service in Victoria 
and a member of the Landsat science team. 
He uses remote sensing to study forests, and 
notes that good spectral and radiometric 
resolution (detection of small differences in 
wavelength and radiation, respectively), are 
essential for quantitative scientific analyses. 
“These very small satellites should not be 
expected to provide data that are similar 
or in competition with full-blown Earth- 
observing satellites,” he says. “They occupy 
a different niche.”

The scientific value of the swarm data 
will be “radically dependent” on quality 
issues, says Greg Asner, an Earth scientist 
at the Carnegie Institution for Science in 
Stanford, California. Stitching together 
such frequent-repeat imagery from so 
many satellites will be challenging, because 
performance will probably differ between 
satellites and vary over time, he argues. 

But he is nonetheless excited at the pros- 
pect of constantly updated fresh imagery. 
“It will almost be like updating Google
Earth each day,” he says.
RESEARCH FUNDING

China tops Europe
in R&D intensity

Reforms to commercial and academic research systems still
needed despite reaching spending milestone, say scientists.

BY RICHARD VAN NOORDEN

By pouring cash into science and technology
faster than its economy has
expanded, China has for the first time
overtaken Europe on a key measure of innovation:
the share of its economy devoted to
research and development (R&D).

In 2012, China invested 1.98% of its gross
domestic product (GDP) into R&D, just edging
out the 28 member states of the European
Union (EU), which together managed 1.96%,
according to the latest estimates of research
intensity, to be released this month by the Paris-based
Organisation for Economic Co-operation
and Development (OECD).

The figures show that China’s research
intensity has tripled since 1998, whereas
Europe’s has barely increased (see ‘Shooting
star’). The numbers are dominated by
business spending, reflecting China’s push
in the manufacturing and information- and
communication-technology industries.

Research at computer firm Lenovo is helping to drive China’s rising R&D spending.

James Wilsdon, a science-policy analyst at the 
University of Sussex in Brighton, UK, says that 
China's R&D juggernaut is “astonishing”, con- 
sidering that the entire system emerged only 
after the end of the Cultural Revolution in 1976. 
In absolute terms, China's R&D spending is still 
almost one-third lower than that of Europe, but 
the new figures are “a significant milestone”,
says Wilsdon. 

The reorientation of China’s economy dis- 
plays its soaring ambition. However, money 
does not buy innovation. Despite success in 
some areas, notably high-speed rail, solar 
energy, supercomputing and space explora- 
tion, leaders in China are concerned that 
innovation is lacking, say science-policy 
analysts. “Chinese leaders would like some- 
thing equivalent to a Nobel prize, or a world- 
class product similar to an iPhone,” says 
Denis Simon, an expert on Chinese science 
and innovation at Arizona State University 
in Tempe. “But there is a lot of risk aversion
within the Chinese R&D system that doesn’t
allow for entrepreneurial behaviour.”

China’s leaders recognize the issues: the 
government is now reviewing a 2006 long- 
term plan on science and technology, and will 
be taking advice from international experts in 
Beijing this month. Lan Xue, director of the 
China Institute of Science and Technology Pol- 
icy at Tsinghua University in Beijing, expects 
some changes at the level of academic science. 
“I’m relatively optimistic that there will be
improvement in how R&D programmes are 
managed and peer-reviewed.” 

In contrast to China’s rapid rise, Europe's 
R&D spending has remained stagnant. The 
continent has made little headway in the past 
decade on a long-term target to reach 3% of 
GDP by 2020. “The European Commission has 
long warned that China is catching up in terms 
of R&D intensity,” says Michael Jennings, a
spokesman for research at the commission. 
“The EU needs a real push now to increase 
R&D spending in the public sector, but especially
in the private sector.”

One problem is that the commission can- 
not dictate business spending for individual 
member states. Another is the expansion of the 
EU, which has brought down average research 
intensity. OECD figures show the stark con- 
trast between nations such as Germany, at 
2.92% of GDP, and newer EU members such 
as Croatia, at 0.75%. Jennings adds, however, 
that an almost 30% boost to Horizon 2020, the 
EU research programme, is a good sign. 

Some analysts argue that Europe does not 
need to be too worried by the stasis in research 
intensity. The number is an increasingly poor 
indicator of innovative activity, argues Kieron 
Flanagan, a science-policy analyst at the Uni- 
versity of Manchester, UK. For example, it fails 
to pick up on innovation in the service-oriented 
industries that dominate many Western econo- 
mies. An architectural or advertising firm could 
innovate while meeting the demands of a contract,
making advances that could be widely
copied and meaningfully affect an economy. Yet
they would not count as R&D spending.

SHOOTING STAR
China has passed Europe in its investment in
research and development (R&D) as a percentage
of its gross domestic product (GDP).
Lines shown: United States, Europe, Japan, China.
Vertical axis: R&D investment (% of GDP). Horizontal axis: 1995 to 2010.

In China, meanwhile, “a great stodgy mass” 
of state-owned enterprises dominates commer- 
cial R&D spending — and they might actually 
suppress innovation, says Wilsdon. Accord- 
ing to a study co-authored by Wilsdon and 
published in October 2013 by the innovation
charity Nesta, based in London, the state
companies might block more-inventive small
and medium-sized enterprises. China, the study
argues, is an “absorptive state”: one that
adopts and adapts
incoming technologies from overseas but does 
little breakthrough research. However, Wilsdon 
points to a few eye-catching bright spots: 
privately held, globally minded companies that 
include the telecommunications firms Huawei 
Technologies and ZTE, the e-commerce giant 
Alibaba and the computer firm Lenovo. 

China’s emphasis on applied and product- 
development research means that funding for 
basic science remains low: only 5% of the coun- 
try’s total R&D is devoted to this, compared 
with 15-20% in other major OECD nations. 
That money has to support a larger number of 
researchers who are already poorly paid, says 
Xue. Many academics, he says, complement 
their salaries by taking on short-term projects 
for industry — work that can distract their focus 
from fundamental science problems. 

Funding and evaluation systems suffer other 
distortions, says Cong Cao, a science-policy 
analyst at the University of Nottingham, UK. 
Grant money is not disbursed transparently, 
and basic-research funding tends to go to emi- 
nent scientists and safe projects, he says, with 
academics judged mechanically on the num- 
ber of publications that they author. A stag- 
gering rise in scientific output has not yet been 
matched by an equivalent rise in highly cited 
articles; swathes of patents are filed but rarely 
used. Wilsdon says that world-class research 
occurs at the country’s top 30 universities and 
at Chinese Academy of Sciences institutes. “But
it is still very patchy, and a lot of it is reliant on a
relatively small number of outstanding scientists
lured back from overseas,” he says.

Simon adds that China’s scientists need 
more independence and freedom to work 
on risky projects. Such changes might be on 
the way: Cao expects that at the forthcoming 
review of China's 2006 science plan, funding 
agencies will be told to be more transparent 
about their grants and grantees, and Chinese 
researchers will be allowed to use more of their 
funding to boost the salaries of research staff. 

One of the plan’s paramount goals seems 
to be right on target, however: China, unlike 
Europe, looks set to boost its research spending 
to 2.5% of GDP by 2020.
THE LEARNING MACHINES


Using massive amounts of data to recognize photos and speech, 
deep-learning computers are taking a big step towards true artificial intelligence. 


BY NICOLA JONES 


Three years ago, researchers at the
secretive Google X lab in Mountain
View, California, extracted some
10 million still images from YouTube
videos and fed them into Google Brain,
a network of 1,000 computers programmed
to soak up the world much as a
human toddler does. After three days looking 
for recurring patterns, Google Brain decided, 
all on its own, that there were certain repeat- 
ing categories it could identify: human faces, 
human bodies and ... cats (ref. 1).

Google Brain’s discovery that the Inter- 
net is full of cat videos provoked a flurry of 
jokes from journalists. But it was also a land- 
mark in the resurgence of deep learning: a 
three-decade-old technique in which mas- 
sive amounts of data and processing power 
help computers to crack messy problems that
humans solve almost intuitively, from recog- 
nizing faces to understanding language. 

Deep learning itself is a revival of an even 
older idea for computing: neural networks. 
These systems, loosely inspired by the densely 
interconnected neurons of the brain, mimic 
human learning by changing the strength of 
simulated neural connections on the basis of 
experience. Google Brain, with about 1 mil- 
lion simulated neurons and 1 billion simu- 
lated connections, was ten times larger than 
any deep neural network before it. Project 
founder Andrew Ng, now director of the 
Artificial Intelligence Laboratory at Stanford 
University in California, has gone on to make 
deep-learning systems ten times larger again. 
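
The idea of learning by "changing the strength of simulated neural connections on the basis of experience" can be illustrated at toy scale. The sketch below is not how Google Brain works: it assumes a single simulated neuron trained with the classic perceptron rule, a far simpler ancestor of deep learning, and the function names and the logical-OR training data are hypothetical choices made for illustration only.

```python
def train_neuron(examples, epochs=20, rate=0.1):
    """Train one simulated neuron with the perceptron rule (illustrative only)."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs  # connection strengths, initially zero
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            output = 1 if activation > 0 else 0
            error = target - output  # -1, 0 or +1
            # Experience changes each connection in proportion to its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay silent (0)."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0


# Hypothetical 'experience': the neuron should fire if either input is on.
or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_examples)
print([predict(weights, bias, x) for x, _ in or_examples])  # [0, 1, 1, 1]
```

A deep network stacks many such units in layers and adjusts millions or billions of connection strengths at once, which is where the data and computing demands described here come from.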

Such advances make for exciting times in 
artificial intelligence (AI), the often-frustrating
attempt to get computers to think like
humans. In the past few years, companies such 
as Google, Apple and IBM have been aggres- 
sively snapping up start-up companies and 
researchers with deep-learning expertise. 
For everyday consumers, the results include 
software better able to sort through photos, 
understand spoken commands and translate 
text from foreign languages. For scientists and 
industry, deep-learning computers can search 
for potential drug candidates, map real neural 
networks in the brain or predict the functions 
of proteins. 

“AI has gone from failure to failure, with bits
of progress. This could be another leapfrog,”
says Yann LeCun, director of the Center for 
Data Science at New York University and a 
deep-learning pioneer. 

“Over the next few years we'll see a feeding 
frenzy. Lots of people will jump on the deep-learning
bandwagon,” agrees Jitendra Malik,
who studies computer image recognition at 
the University of California, Berkeley. But 
in the long term, deep learning may not win 
the day; some researchers are pursuing other 
techniques that show promise. “I’m agnostic,” 
says Malik. “Over time people will decide what 
works best in different domains.” 


INSPIRED BY THE BRAIN 

Back in the 1950s, when computers were new, 
the first generation of AI researchers eagerly 
predicted that fully fledged AI was right 
around the corner. But that optimism faded as 
researchers began to grasp the vast complexity 
of real-world knowledge — particularly when 
it came to perceptual problems such as what 
makes a face a human face, rather than a mask 
or a monkey face. Hundreds of researchers and
graduate students spent decades hand-coding 
rules about all the different features that com- 
puters needed to identify objects. “Coming up 
with features is difficult, time consuming and 
requires expert knowledge,” says Ng. “You have 
to ask if there's a better way.” 

In the 1980s, one better way seemed to be 
deep learning in neural networks. These sys- 
tems promised to learn their own rules from 
scratch, and offered the pleasing symmetry 
of using brain-inspired mechanics to achieve 
brain-like function. The strategy called for 
simulated neurons to be organized into sev- 
eral layers. Give such a system a picture and 
the first layer of learning will simply notice all 
the dark and light pixels. The next layer might 
realize that some of these pixels form edges; 
the next might distinguish between horizon- 
tal and vertical lines. Eventually, a layer might 
recognize eyes, and might realize that two eyes 
are usually present in a human face (see
‘Facial recognition’).

NATURE.COM
Learn about another approach to brain-like computers:
go.nature.com/fktnso

The first deep-learning programs did not
perform any better than simpler systems,
says Malik. Plus, they were
tricky to work with. “Neural nets were always 
a delicate art to manage. There is some black 
magic involved,” he says. The networks needed 
a rich stream of examples to learn from, like
a baby gathering information about the world.
In the 1980s and 1990s, there was not much 
digital information available, and it took too 
long for computers to crunch through what 
did exist. Applications were rare. One of the 
few was a technique, developed by LeCun,
that is now used by banks to read handwritten
cheques. 

By the 2000s, however, advocates such as 
LeCun and his former supervisor, computer 
scientist Geoffrey Hinton of the University 
of Toronto in Canada, were convinced that 
increases in computing power and an explosion
of digital data meant that it was time for a
renewed push. “We wanted to show the world 
that these deep neural networks were really 
useful and could really help,” says George Dahl,
a current student of Hinton’s.

As a start, Hinton, Dahl and several others
tackled the difficult but commercially impor- 
tant task of speech recognition. In 2009, the 
researchers reported (ref. 2) that after training on
a classic data set — three hours of taped and 
transcribed speech — their deep-learning neu- 
ral network had broken the record for accuracy 
in turning the spoken word into typed text, a 
record that had not shifted much in a decade 
with the standard, rules-based approach. The 
achievement caught the attention of major 
players in the smartphone market, says Dahl, 
who took the technique to Microsoft during 
an internship. “In a couple of years they all 
switched to deep learning.” For example, the
iPhone's voice-activated digital assistant, Siri, 
relies on deep learning. 


GIANT LEAP 
When Google adopted deep-learning-based 
speech recognition in its Android smartphone 
operating system, it achieved a 25% reduction 
in word errors. “That’s the kind of drop you 
expect to take ten years to achieve,” says Hinton,
a reflection of just how difficult it has
been to make progress in this area. “That’s like 
ten breakthroughs all together.”

Meanwhile, Ng had convinced Google to let 
him use its data and computers on what became 
Google Brain. The project's ability to spot cats
was a compelling (but not, on its own, commer- 
cially viable) demonstration of unsupervised 
learning — the most difficult learning task, 
because the input comes without any explana- 
tory information such as names, titles or 
categories. But Ng soon became troubled that 
few researchers outside Google had the tools to 
work on deep learning. “After many of my talks,”
he says, “depressed graduate students would
come up to me and say: ‘I don’t have 1,000 computers
lying around, can I even research this?’”

So back at Stanford, Ng started develop- 
ing bigger, cheaper deep-learning networks 
using graphics processing units (GPUs) — the 
super-fast chips developed for home-computer 
gaming (ref. 3). Others were doing the same. “For
about US$100,000 in hardware, we can build 
an 11-billion-connection network, with 
64 GPUs,” says Ng.


VICTORIOUS MACHINE 

But winning over computer-vision scientists 
would take more: they wanted to see gains on 
standardized tests. Malik remembers that Hin- 
ton asked him: “You're a sceptic. What would 
convince you?” Malik replied that a victory in 
the internationally renowned ImageNet com- 
petition might do the trick. 

In that competition, teams train computer 
programs on a data set of about 1 million 
images that have each been manually labelled 
with a category. After training, the programs 
are tested by getting them to suggest labels 
for similar images that they have never seen 
before. They are given five guesses for each test 
image; if the right answer is not one of those 
five, the test counts as an error. Past winners 
had typically erred about 25% of the time. In 
2012, Hinton’s lab entered the first ever com- 
petitor to use deep learning. It had an error rate 
of just 15% (ref. 4). 
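
The scoring rule described above (five guesses per image, with an error counted only if none of them is correct) is commonly called top-5 error. As a minimal sketch of the arithmetic, assuming guesses are held as ranked lists of labels, it could be computed as follows. The function name and the four-image toy data are hypothetical and are not taken from the competition itself.

```python
def top5_error_rate(guesses, truths):
    """Fraction of test images whose true label is missing from the first five guesses."""
    if len(guesses) != len(truths):
        raise ValueError("guesses and truths must have the same length")
    if not truths:
        raise ValueError("need at least one test image")
    errors = sum(
        1 for ranked, truth in zip(guesses, truths) if truth not in ranked[:5]
    )
    return errors / len(truths)


# Hypothetical toy data: four test images, each with five ranked guesses.
toy_guesses = [
    ["cat", "dog", "fox", "lynx", "wolf"],
    ["car", "bus", "van", "tram", "truck"],
    ["oak", "elm", "ash", "fir", "yew"],
    ["cup", "mug", "bowl", "jug", "vase"],
]
toy_truths = ["lynx", "bicycle", "oak", "mug"]

toy_error = top5_error_rate(toy_guesses, toy_truths)
print(f"top-5 error: {toy_error:.0%}")  # top-5 error: 25%
```

Only the second toy image counts as an error, because "bicycle" appears nowhere among its five guesses. A correct label in fourth place, as for the first image, still counts as a success.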

“Deep learning stomped on everything else,” 
says LeCun, who was not part of that team. The 
win landed Hinton a part-time job at Google, 
and the company used the program to update 
its Google+ photo-search software in May 2013. 

Malik was won over. “In science you have 
to be swayed by empirical evidence, and this 
was clear evidence,” he says. Since then, he has
adapted the technique to beat the record in 
another visual-recognition competition (ref. 5). Many
others have followed: in 2013, all entrants to 
the ImageNet competition used deep learning. 

With triumphs in hand for image and speech 
recognition, there is now increasing interest in 
applying deep learning to natural-language 
understanding — comprehending human 
discourse well enough to rephrase or answer 
questions, for example — and to translation 
from one language to another. Again, these are 
currently done using hand-coded rules and 
statistical analysis of known text. The state- 
of-the-art of such techniques can be seen in 
software such as Google Translate, which can 
produce results that are comprehensible (if 
sometimes comical) but nowhere near
as good as a smooth human translation.
“Deep learning will have a chance to do 
something much better than the current
practice here,” says crowd-sourcing
expert Luis von Ahn, whose company 
Duolingo, based in Pittsburgh, Penn- 
sylvania, relies on humans, not com- 
puters, to translate text. “The one thing 
everyone agrees on is that it’s time to try 
something different.” 


DEEP SCIENCE 

In the meantime, deep learning has 
been proving useful for a variety of 
scientific tasks. “Deep nets are really 
good at finding patterns in data sets,” 
says Hinton. In 2012, the pharmaceuti- 
cal company Merck offered a prize to 
whoever could beat its best programs 
for helping to predict useful drug can- 
didates. The task was to trawl through 
database entries on more than 30,000 
small molecules, each of which had 
thousands of numerical chemical-prop- 
erty descriptors, and to try to predict 
how each one acted on 15 different tar- 
get molecules. Dahl and his colleagues 
won $22,000 with a deep-learning sys- 
tem. “We improved on Merck's baseline 
by about 15%,” he says.

Biologists and computational 
researchers including Sebastian Seung 
of the Massachusetts Institute of Tech- 
nology in Cambridge are using deep 
learning to help them to analyse three- 
dimensional images of brain slices. Such 
images contain a tangle of lines that rep- 
resent the connections between neu- 
rons; these need to be identified so they can be 
mapped and counted. In the past, undergradu- 
ates have been enlisted to trace out the lines, 
but automating the process is the only way to 
deal with the billions of connections that are 
expected to turn up as such projects continue. 
Deep learning seems to be the best way to auto- 
mate. Seung is currently using a deep-learning 
program to map neurons in a large chunk of the
retina, then forwarding the results to be proof- 
read by volunteers in a crowd-sourced online 
game called EyeWire. 

William Stafford Noble, a computer scien- 
tist at the University of Washington in Seattle, 
has used deep learning to teach a program to 
look at a string of amino acids and predict the 
structure of the resulting protein — whether 
various portions will form a helix or a loop, for 
example, or how easy it will be for a solvent to 
sneak into gaps in the structure. Noble has so 
far trained his program on one small data set, 
and over the coming months he will move on to 
the Protein Data Bank: a global repository that 
currently contains nearly 100,000 structures. 

For computer scientists, deep learning
could earn big profits: Dahl is thinking about
start-up opportunities, and LeCun was hired
last month to head a new AI department at
Facebook. The technique holds the promise
of practical success for AI. “Deep learning
happens to have the property that if you feed it
more data it gets better and better,” notes Ng.
“Deep-learning algorithms aren’t the only ones
like that, but they’re arguably the best, certainly
the easiest. That’s why it has huge promise
for the future.”

FACIAL RECOGNITION
Deep-learning neural networks use layers of increasingly
complex rules to categorize complicated shapes such as faces.
Layer 1: The computer identifies pixels of light and dark.
Layer 2: The computer learns to identify edges and simple shapes.
Layer 3: The computer learns to identify more complex shapes and objects.
Layer 4: The computer learns which shapes and objects can be used to define a human face.

Not all researchers are so committed to the
idea. Oren Etzioni, director of the Allen Institute
for Artificial Intelligence in Seattle, which
launched last September with the aim of developing
AI, says he will not be using the brain for
inspiration. “It’s like when we invented flight,” he
says; the most successful designs for aeroplanes
were not modelled on bird biology. Etzioni’s
specific goal is to invent a computer
that, when given a stack of scanned text- 
books, can pass standardized elemen- 
tary-school science tests (ramping up 
eventually to pre-university exams). To 
pass the tests, a computer must be able 
to read and understand diagrams and 
text. How the Allen Institute will make 
that happen is undecided as yet — but for 
Etzioni, neural networks and deep learn- 
ing are not at the top of the list. 

One competing idea is to rely on a 
computer that can reason on the basis 
of inputted facts, rather than trying to 
learn its own facts from scratch. So it 
might be programmed with assertions 
such as ‘all girls are people’. Then, when
it is presented with a text that mentions 
a girl, the computer could deduce that 
the girl in question is a person. Thou- 
sands, if not millions, of such facts are 
required to cover even ordinary, com- 
mon-sense knowledge about the world. 
But it is roughly what went into IBM’s 
Watson computer, which famously 
won a match of the television game 
show Jeopardy against top human com- 
petitors in 2011. Even so, IBM’s Watson 
Solutions has an experimental interest 
in deep learning for improving pattern 
recognition, says Rob High, chief tech- 
nology officer for the company, which 
is based in Austin, Texas. 
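
The kind of reasoning from inputted facts described above can be sketched in a few lines. This is a toy illustration only, and makes no claim about how Watson is built: it assumes knowledge is stored as simple 'all X are Y' rules and applies them repeatedly until nothing new follows. The function name, the individual 'Alice' and the second rule used in the usage line are hypothetical.

```python
def deduce(facts, rules):
    """Apply 'all X are Y' rules repeatedly until no new facts appear.

    facts: set of (individual, category) pairs, e.g. ("Alice", "girl").
    rules: set of (narrower, broader) pairs, e.g. ("girl", "person").
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for individual, category in list(known):
            for narrower, broader in rules:
                if category == narrower and (individual, broader) not in known:
                    known.add((individual, broader))
                    changed = True
    return known


rules = {("girl", "person")}   # the assertion 'all girls are people'
facts = {("Alice", "girl")}    # hypothetical: a text mentions a girl called Alice
conclusions = deduce(facts, rules)
print(("Alice", "person") in conclusions)  # True
```

Adding a further hypothetical rule such as ("person", "mammal") would let the same loop chain two steps and conclude that Alice is a mammal. The difficulty the article points to is one of scale: someone has to supply the thousands or millions of rules.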

Google, too, is hedging its bets. 
Although its latest advances in picture 
tagging are based on Hinton’s deep- 
learning networks, it has other depart- 
ments with a wider remit. In December 
2012, it hired futurist Ray Kurzweil to pursue 
various ways for computers to learn from 
experience — using techniques including but 
not limited to deep learning. Last May, Google 
acquired a quantum computer made by D-Wave
in Burnaby, Canada (see Nature 498, 286-288; 
2013). This computer holds promise for non- 
AI tasks such as difficult mathematical com- 
putations — although it could, theoretically, be 
applied to deep learning. 

Despite its successes, deep learning is still in 
its infancy. “It’s part of the future,” says Dahl. 
“In a way it’s amazing we’ve done so much with
so little.” And, he adds, “we’ve barely begun”.


Nicola Jones is a freelance reporter based near 
Vancouver, Canada. 
CAUGHT ON CAMPUS


Violent incidents at academic institutions have spurred universities to adopt 
formal procedures designed to keep campuses safer. But do the tactics work? 


BY BRENDAN MAHER 


In many pictures of her online, Kayla Bourque looks like a typical college
student: there are selfies of her on a coastal holiday, or smirking
mischievously after an experiment in hair colour. But in 2012, Bourque,
then a criminology student at Simon Fraser University in British
Columbia, told a classmate that she fantasized about killing a homeless 
person and that she was studying forensics so that she could get away with 
it. She also talked about killing her family pets and neighbourhood cats. 
The classmate told a teaching assistant what Bourque had said, and the 
department chair called campus security. This triggered a formal process 
called a threat assessment, in which security, university administrators 
and outside consultants gathered evidence and evaluated Bourque’s recent
behaviour. They took the allegation seriously, says Stephen Hart, a forensic 
psychologist at Simon Fraser who advised on the case. “Often something 
like this is a cry for help,” he says. But her actions on several occasions suggested
that she might pose a threat to other students, so simply referring
her to the university’s outpatient mental-health services would not suffice. 
The team notified the local police, and told Bourque that she would not be 
able to return to university without a thorough psychological evaluation.

Then, while university employees were packing up her dorm room, 
they found what has been described in court documents as a ‘kill kit’: 
a bag containing a kitchen knife, a razor blade, latex gloves, a syringe 
and plastic ties — the kind used to restrain people. “They realized that 
this wasn’t just a call for help,” says Hart. The discovery led to a search
warrant for her computer, on which police found violent pornogra- 
phy, disturbing artwork and more selfies, including one of her standing 
naked next to her disembowelled dog, Molly. 

Bourque spent nine months in custody in 2012 for killing Molly, as 
well as her cat Snowflake, and for possession of a weapon. When she 
was released it was with an impressive list of probationary conditions, 
including not using the Internet unsupervised, informing anyone she 
interacts with about her crimes, never owning a pet, and staying away 
from Simon Fraser. As horrifying as the case is, Hart sees it as a major 
triumph for the growing field of threat assessment. 

Although they are exceedingly rare, the number of violent incidents 
reported on college and university campuses has been increasing. 
Recently, academic institutions have served as the backdrop to a series 
of highly publicized attacks — and sometimes scientists are the central 
figures. In 2010, Amy Bishop, a biology professor, gunned down three 
fellow faculty members at the University of Alabama at Huntsville after 
being denied tenure (see Nature 465, 150-155; 2010). Two years later, 
James Holmes withdrew from his PhD studies in neuroscience at the 
University of Colorado Denver about a month before killing 12 people 
and wounding 58 at a cinema in Aurora. 

In many cases, such events are preceded by an escalation in threatening 
or aberrant behaviour. Holmes had told a university psychiatrist that he 
fantasized about killing, and Bishop's behaviour allegedly prompted a 
dean and provost at the university to request police protection months 
before the attack. Both of these cases are subjects of pending wrongful- 
death lawsuits. But could the attacks have been prevented? 

That is the goal of threat assessment, in which organizations adopt 
formal procedures to identify and mitigate a dangerous situation before 
it explodes into violence. Threat-assessment teams and plans are becom- 
ing standard at colleges and univer- 
sities in the United States, and are 
mandatory in some states, includ- 
ing Virginia, Connecticut and Illi- 
nois. Other countries are following 
suit. “The biggest push we've seen 
has been in higher education,” says 
Marisa Randazzo, a social psycholo- 
gist and former US Secret Service 
agent who works with a threat- 
assessment consulting firm. 

It is difficult to prove that the tactics work, and there are concerns 
that they may tread on civil liberties, but many see threat assessment as 
a necessary part of emergency preparedness. “We live in one of the most
violent societies,” says Reid Meloy, a forensic psychologist at the Univer- 
sity of California, San Diego. “Anything we can do to mitigate risk is of 
value and something important to consider.”


ROOTS OF VIOLENCE 

Talk to anyone in the threat-assessment field about violence on col- 
lege campuses, and Gene Deisinger’s name will inevitably come up. 
Deisinger was clinical director of the counselling centre at Iowa State 
University in Ames when the institution decided to build a threat- 
assessment team. A number of events influenced that decision: in 1986, 
a former computer-science student set fire to the house of one of his 
professors, killing two of the professor's children. 


Then, in 1991, a young physicist at the University of Iowa in Iowa City
killed five people and himself, reportedly because he had been passed
over for a thesis prize. In response to these and some other incidents,
Loras Jaeger, then chief of Iowa State’s campus police department, asked
Deisinger if he would help to create a threat-management team. “I didn’t
know what one was, and so I started researching,” Deisinger says.

NATURE.COM: For a podcast and interview with Gene Deisinger, see go.nature.com/qxjgi4

The US Secret Service — tasked with protecting the president and 
other public officials — had a long history of developing methods to 
assess threats, but that work was not publicly available at the time. Deis- 
inger turned to research on dangerous behaviour, workplace violence 
and dealing with students in crisis. From that work, he developed an 
approach for identifying behavioural concerns and intervening in a 
campus setting. He had a team up and running by 1994. The duties grew 
quickly, he says, so Jaeger created a full-time position for him within 
the university police department. During the first month, Jaeger asked 
Deisinger to report to the state police academy for training, something 
the 33-year-old psychologist was not eager to do at that stage in his 
career. But he is glad that he relented. As “a psychologist that carries a 
badge and a gun, a lot of doors open up”, he says.

A campus threat-assessment team is interdisciplinary, and includes 
law-enforcement professionals, psychologists, academic administrators, 
representatives from student services and human resources and legal 
counsel. When someone reports a suspicious behaviour, such as a threat 
from a student, the team often starts by confronting the person about 
the behaviour. They may talk to peers, advisers and teachers. 

By studying past attacks through the lens of psychology, researchers 
have identified a range of behaviours and environmental factors that 
may conspire to trigger violence. Individuals may exhibit extreme or 
sudden changes in behaviour, alienate themselves or others, or adopt 
unhealthy interests in weapons or violent acts. Environmental factors 
may include a tolerance to aggressive interactions in a workplace, an 
unresolved conflict, or the existence of cliques or pecking orders. And 
there are often precipitating events. These could be personal conflicts 
or work-life pressures — such as not getting tenure or a key grant — that 
an individual has had trouble dealing with appropriately. 

Empirical data on attacks suggest that there is a ‘pathway to violence’: 
there may be some form of grievance, the development of an inten- 
tion to do harm, then research, planning and preparation. Bourque, 
for example, told psychologists that 
she had been taking a bus into town 
to check out potential victims, and 
her ‘kill kit’ suggested an advanced 
stage of preparation. But the data 
can tell only so much, because such 
attacks are rare. There are no simple 
checklists and no simple profile of 
an attacker. 

If concerns are legitimate and a 
threat-assessment team decides that 
a person may be on a trajectory towards violence, then the group works 
to manage the threat, often by putting the individual in touch with sup- 
port or mental-health services, or by working out a way to resolve the 
environmental factors contributing to the situation. Team members may 
make regular visits, or what Deisinger describes as ‘coffee dates’. These
are meant to help to keep an eye on people, and unless the person has 
violated the law or a university rule they are considered voluntary. Most 
of them are cordial: “I’ve established a rapport with you that you would 
at least allow me through the front door, which would allow me to see 
your living circumstances,” says Deisinger. “Is there evidence of psychotic 
deterioration? Are there weapons stacked in the corner? Are you taking 
care of yourself? Is there food there? Is your hygiene intact?” 

It all sounds a bit Big Brother-ish, and Deisinger doesn’t pretend 
otherwise. These individuals, he says, “know that I’m doing this. We don't 
play games with it, most of the time.” Still, he says, it can be surprising how 
open people often are. Bourque “talked freely” to the threat-assessment 
team that evaluated her, says Hart, and even gave them permission to pack 
up her residence, which was where they spotted the ‘kill kit’.

Threat assessment works only when people have signalled an intent
to do harm. Luckily, these signals often appear. In the 1990s, the Secret
Service looked at 83 individuals who had attacked or come close to
attacking a prominent public official or public figure. It showed that 63%
had communicated some sort of threat in advance, although rarely to
the intended target. “The people who carry out these acts, they typically
tell someone what they're planning to do,” says Randazzo. “We've seen
many cases where they broadcast it on social media.”

DANGEROUS TREND
A study of attacks reported on campuses in the United States since 1900 showed
that the increasing frequency of such incidents largely followed rising student
numbers, but enrolment alone may not fully explain the increase.
(Figure: student enrolment and number of incidents of directed assaults between 1900 and 2009.)


20/20 HINDSIGHT 

In November 2005, campus security at Virginia Polytechnic Institute and 
State University in Blacksburg received a report about Seung-Hui Cho, a 
South Korean-born student studying English who was allegedly harass- 
ing a female student. A room-mate added that Cho had made comments 
about contemplating suicide. Cho was assessed three times and said that 
the suicidal statements were a joke. He was briefly hospitalized. 

In February and March of 2007, Cho bought two pistols; in April, he 
killed 32 people in what is, so far, the deadliest campus shooting in history. 

The incident launched a national study on campus attacks by the Secret 
Service, the US Federal Bureau of Investigation and the US Department 
of Education. It collected information on 272 incidents in the United 
States between 1900 and 2008. The study showed that such events are
rare, but that they are increasing in frequency. For instance, the report
catalogues only 25 incidents between 1970 and 1979, but 83 between 
2000 and 2008 (see ‘Dangerous trend’). The rise in student numbers 
is certainly a factor in the increase, and it is likely that more incidents 
are being reported than in the past. But those changes might not fully 
account for the trend, says Andre Simons, a behavioural scientist with 
the FBI who is working on a follow-up to the report. 

The study also revealed the varied nature of attacks occurring in a 
university setting. Students or former students accounted for 60% of 
the perpetrators, and another 11% were employees. But the remaining 
29% were not official members of the campus. 

It is hard to say whether the individuals posed clear threats before 
their attacks. Nearly 30% of perpetrators displayed threatening behav- 
iour, such as stalking or harassing, making verbal or written threats, or 
being physically aggressive towards their targets. Another 20% exhibited 
at least some sort of concerning behaviour to a friend, family member, 
work associate or police officer. But such behaviours could be vague and 
general, and were not always reported. 

In Cho’s case, there were signs of aberrant behaviour, but no process
was in place to follow him in an exhaustive way, says Deisinger. After 
the shooting, Deisinger consulted with Virginia Tech to help build a 
threat-assessment team there, and he was eventually hired as the univer- 
sity’s chief of police. The Virginia Tech shooting was a pivotal case that 
spurred more universities to develop such teams, says Meloy, although
how many exist is unclear. A self-reported survey from 2012 found that
92% of universities and community colleges in the United States have
some sort of team in place, but it included other kinds of behavioural-intervention
teams that do not typically work directly with the police.
The trend is not limited to the United States: universities in Australia 
have increasingly been taking an interest in threat-assessment proce- 
dures; and an estimated ten universities in German-speaking countries 
have established or developed plans for teams, says Jens Hoffmann, a 
psychologist at the Institute of Psychology and Threat Management in 
Darmstadt, Germany, and co-editor with Meloy of the International 
Handbook of Threat Assessment (Oxford Univ. Press, 2014). 


SAFE AND SOUND? 

These systems and tactics for managing threats cannot stamp out all 
targeted violence. In 2009, after Virginia Tech had established its threat- 
assessment team but before Deisinger had arrived, a graduate student in 
agricultural economics beheaded a woman who had rebuffed his roman- 
tic advances. And a team was in place at the University of Colorado where 
Holmes made threats to a mental-health professional, but Holmes left 
the university shortly after, making it difficult to follow up on the case. 

There are bound to be missed signals, says Deisinger. And threat 
assessment is only as good as the vigilance of a community, because it 
relies heavily on reporting. But in the wake of recent high-profile shoot- 
ings, this vigilance has improved, says Randazzo. “People are reporting 
things that seem not right in ways that they didn't in the past.” Many 
cite the “see something, say something” campaigns that have blanketed 
New York City since the terrorist attacks of 11 September 2001 as having 
helped to encourage people to report suspicious behaviour. But a team 
is useless if no one knows about it, so Deisinger and others have tried to 
spread the word by creating websites and fliers, as well as holding train- 
ing sessions on how to deal with inappropriate behaviour. 

There are costs to all this watchfulness, however, says Joe Cohn, legisla- 
tive and policy director for the Foundation for Individual Rights in Edu- 
cation in Philadelphia, Pennsylvania. “It's not unusual for universities to 
engage in behaviours that chill freedom of speech in the name of safety,” 
he says. He cites recent examples in which a student was expelled for 
protesting over the construction of a parking garage and a professor was 
reported to a threat-assessment team for hanging posters with aggressive 
messages outside his office. He urges teams to include civil libertarians to 
better ensure that universities do not encroach on people’s rights. 

It is also difficult to prove that having a threat-assessment team makes 
a campus any safer. There are no standards for how to report a success- 
ful case, and privacy concerns make the sharing of data complicated. 
Deisinger says that his team tracks cases to see whether interventions 
have improved the situation for the individual and the people around 
him or her. “Most of them, we can resolve to a level that is akin to the
day-to-day moderately inappropriate behaviour,” he says. It is not ideal,
“but it’s liveable”, he adds. The field is trying to set standards and collect
data on how well the process works. Phase two of the FBI’s campus- 
attacks study, which will focus on attacks that happened between 1985 
and 2010, may fill in some of these holes. 

These are all concerns that weigh heavily on Deisinger. But what wor- 
ries him most is the thought that someone, somewhere, is planning some- 
thing that no one can anticipate. “I’m often asked, because of the cases we 
work, ‘How do you sleep at night?’” He does not worry so much about
people who have already been identified by his team. “It’s the cases I don't
know about,” he says, “that give me difficulty sleeping.” SEE EDITORIAL P. 131


Brendan Maher is a features editor for Nature based in New York. 
Remote control of drones is one of the many contributions science and technology have made to war.


MILITARY SCIENCE 


The evolving science of war 


Sharon Weinberger assesses two studies probing the roles of physics and psychology 
in conflicts past, present and future. 


Barry Parker’s chronicle of the interplay
between the military and science,
The Physics of War, is largely a record 
of people developing more effective ways to 
kill each other. So it is poignant that Parker, a 
physicist, begins the book with a passage on 
a battle that took place more than 3,000 years 
ago in what is now Syria, a country in the mid- 
dle of a bloody civil war threatening to draw 
in world powers. It seems that fundamentals 
of warfare have not changed, but with the 
advent of science and the creation of more- 
powerful weapons, the stakes are now higher. 
Physics, Parker argues, has enabled much 
of the killing. For thousands of years people 
have used its principles to build increasingly 
powerful weapons, even before they under- 
stood what made the devices work. Weapon 
by weapon, and in excruciating detail, Parker 
shows how a mix of tinkering, basic maths 
and physics — including, much later, nuclear 
physics — enabled the development of weap- 
ons of war, from the chariots of ancient Syria 
to modern thermonuclear weapons. 
That is a lot of ground to cover, and Parker’s
book is best read as a primer for those
interested in the science of weapons and
their contributions to various battles. It is on 
less solid ground in helping us to understand 
when military leaders realized that advanc- 
ing science as a discipline could aid warfare. 
At one point, for example, Parker writes that 
“Napoleon studied physics along with mathe- 
matics and astronomy in military school, and 
knew the importance of science to war”. In the 
same paragraph, however, he states that there 
is “no indication” that Napoleon “took a lot 
of interest in physics, or science, in general”. 
By contrast, Michael Matthews’s lively and
engaging Head Strong makes the weighty 
argument that psychology is emerging as 
the science that will make the difference 
in twenty-first-century warfare. War is 
not just about killing, he argues; it is about 
understanding the enemy, and ourselves. 
Matthews, a military psychologist, makes 
a valiant case, noting how psychology has 
contributed to everything from selecting 
leaders to helping soldiers navigate foreign 
cultures. He predicts that it will one day help 
to produce drugs “capable of regulating the 
brain’s response to combat stress”, perhaps
eliminating post-traumatic stress disorder.

The Physics of War: From Arrows to Atoms
BARRY PARKER
Prometheus Books: 2014.

Head Strong: Psychology and Military
Dominance in the 21st Century
MICHAEL D. MATTHEWS
Oxford University Press: 2013.

When Matthews writes about his own 
research on the psychology of soldier per- 
formance and leadership, or his experience 
as a professor at the United States Military 
Academy at West Point in New York, the book 
springs to life. He shows how psychological 
methods have challenged some of the military 
leadership's entrenched beliefs about gender, 
citing a study he was involved in that surveyed 
Air Force base commanders’ attitudes about 
women. Almost every commander told a 
story of how the pilot of a crashed, burning 
aeroplane died because a female firefighter 
was not strong enough to carry him out. The 
story, Matthews later found, was apocryphal. 
His larger point is how science, particularly
psychology, can inform decisions about
integration. In another example, he notes
that West Point, which trains officers, tar- 
gets women’s enrolment at about 15% to 
reflect the ratio of women in the military. 
That sounds noble; but he notes that West 
Point tries (and has so far failed) to recruit 
African Americans at a rate reflecting 
their representation in the recruiting-age 
population. Were the same rule applied 
to women, he writes, they should make 
up half the class. West Point spokesman 
Francis DeMaro declined to comment on 
goals linked to gender or ethnicity, instead 
providing numbers on the most recent 
entering class (16% women, 10% African 
Americans) that seem to bolster Mat- 
thews’s argument. “We strive to ensure 
our cadet population is representative of 
the soldiers they will lead,” says DeMaro.

Matthews stumbles a bit when talk- 
ing about the importance of psychology 
in understanding foreign cultures. He 
praises the Human Terrain System, the 
well-intentioned but troubled US pro- 
gramme that embeds social scientists 
into teams that deploy with the military 
(see Nature http://doi.org/bxmgsw; 2011). 
Matthews engages in the same kind of 
oversimplification of cultural knowledge 
that underlies the problems facing these 
teams. He recalls how a US military commander
in Iraq learned that arriving heavily
armed at meetings with community
leaders was a “major social blunder” (as 
it might be, of course, in most cultures). 

By focusing on the progression of weap- 
ons, Parker misses the point at which 
physics was overtaken by other fields, 
including psychology, as disciplines cru- 
cial to warfare. But Matthews, in focusing 
so closely on current and future applications
of psychology, omits mention of one
of the most important military psychologists.

In the 1960s, the US Department of 
Defense’s Advanced Research Projects 
Agency hired psychologist J. C. R. Lick- 
lider to create a behavioural sciences 
office. It was his unique insights into 
how man would interact with machine 
in the future that laid the foundation for 
ARPANET, the precursor to the Internet. 
Today, networked computers are as key to 
military command and control as they are 
to modern society. It could be argued that, 
thanks to Licklider, military psychology 
has already revolutionized war. Whether 
it will help the United States to win future 
wars is another matter.


Sharon Weinberger is a Global Fellow 
at the Woodrow Wilson International 
Center for Scholars in Washington DC. 
Her book about the Defense Advanced 
Research Projects Agency will be 
published in 2015. 

e-mail: sharonweinberger@gmail.com 
Feeling the fear


David Adam applauds the autobiography of a high-flyer 
confronting his own nervous suffering head-on. 


Scott Stossel is, in his own words, a
“quivering, quaking, neurotic wreck”.
He is frightened of flying, vomiting and
cheese. He has thrown tennis matches from a
winning position just to get off the exposed 
stage of the court, and struggles to control 
his bowels. For three decades he has been a 
regular in the offices and clinics of psychia- 
trists, psychologists and psychoanalysts, and 
a testing ground for whatever treatment, 
drug or quack therapy they thought might 
bring some relief. 

Stossel is also a married father of two and 
editor of The Atlantic magazine. His terrific 
book My Age of Anxiety is his attempt to 
reconcile those two worlds, and offers an 
unsparing and unsentimental look at a sub- 
ject that many keep hidden: mental illness. 

Stossel suffers from anxiety, a condition 
that he identifies early on as tricky to define. 
Is anxiety the list of symptoms offered by 
psychiatrists? The biological response to 
threat that we share with animals? The 
social consequence of the shared knowledge 
of our mortality? Or the chemical conse- 
quence of misfiring neurotransmitters and 
brain circuitry? 

Books exploring personal experiences
of mental illness tend to be either overwrought
accounts of personal trauma that shed little light
on the world beyond the author’s nose, or
the more detached observations of
scientists and medics. It is rare to find works that bridge these
objectives, which is one reason that the 
writer Andrew Solomon achieved such 
success with The Noonday Demon (Chatto 
& Windus, 2001), his personal and scien- 
tific account of depression. Stossel’s book 
deserves a place on this higher shelf. 

My Age of Anxiety covers all the aca- 
demic ground one would expect. We get the 
biological idea that anxiety is an unsuited 
modern deployment of an atavistic fight- 
or-flight physiological response to threat, 
the psychological basis for conditioned 
responses — that anxiety is a learned, if inap- 
propriate, fear — and the nascent attempts 
to link mind and body through brain scans 
and genetics. With help from some friendly 
neuroscientists, Stossel finds he has a variant 
of the SERT gene implicated in anxiety. 


Stossel is also aware of current controversies in psychiatry.
He gives fair voice, for example, to both sides in the debate
over the usefulness of pharmaceuticals, talk therapies and the shift
from viewing anxiety as a social and philosophical issue to a
disorder of chemical and electrical signals.

My Age of Anxiety: Fear, Hope, Dread, and the Search for Peace of Mind
SCOTT STOSSEL
Knopf: 2014.

And he shows his
skills as a writer with colourful and moving
accounts of traumatic personal episodes. As a
child and adolescent he suffered extreme sep- 
aration anxiety and, aged 13, would wake the 
neighbours and ask them to call the police 
when his parents were out. The treatments 
were often equally grisly. Given an emetic 
syrup to make him vomit as exposure ther- 
apy to rid him of his phobia, he endures only 
hours of severe nausea and painful retching. 

Stossel addresses the heterogeneous 
ingredients of anxiety by trying to cover 
them all — as if a sense of completeness 
alone can bind them together. His policy of 
full disclosure may not always be to every- 
one’s tastes: an anecdote of a blocked toilet 
and a meeting with John F. Kennedy Jr, for 
one, feels gratuitous. But the approach also 
offers useful reminders of the human cost of 
taking strong positions on the use of drugs 
and other areas of scientific and medical 
uncertainty. Poised between a psychiatrist 
who puts him on drugs and a therapist who 
urges him to abandon them, Stossel finds 
himself lying to the therapist to spare her 
feelings when he returns to the psychiatrist. 

One of Stossel’s motives is the hope that 
the book might bring him peace. Still, he 
writes, “If it’s relief from nervous suffering 
that I crave, then burrowing into the his- 
tory and science of anxiety, and into my 
own psyche, is perhaps not the best way to 
achieve it.” 

We should all hope it works: the man is 
due a break. m 


David Adam is Nature’s Editorial and
Columns editor. His first book, The Man 
Who Couldn't Stop: OCD and the True 
Story of a Life Lost in Thought, will be 
published in April 2014. 
Physicists Edward Bowen (left), Lee DuBridge (centre) and I. I. Rabi work on a cavity magnetron in the 1940s.


Shut up and calculate! 


Practical, interdisciplinary ways of working forged during the Second World War had 
a lasting impact on a generation of physicists and their findings, says David Kaiser. 


On 17 October 1940, Karl Compton,
president of the Massachusetts
Institute of Technology (MIT) in
Cambridge, made a hasty telephone call 
from Washington DC to a colleague back 
on campus. Could MIT spare some modest 
space to host an urgent, top-secret defence 
project? After making some quick assess- 
ments, Compton’s assistant reported that 
MIT could shuffle some other laboratories to 
accommodate the facility. With that phone 
call, the Radiation Laboratory, or ‘Rad Lab’,
was born. The laboratory had an enormous 
impact on the course of the Second World 
War. Arguably, its impact on science was 
even greater. 


Within weeks of Compton’s call, a skeleton 
staff at the Rad Lab was hard at work try- 
ing to improve on a British-designed cavity 
magnetron, which they hoped could become 
the centrepiece of a type of short-wavelength 
radar. When the laboratory began operation 
— more than a year before the United States 
entered the Second World War — the staff 
consisted of 20 physicists, three security 
guards, two stockroom clerks and a secre- 
tary. By the war's end, the lab had swollen 
to 4,000 people and was managing develop- 
ment contracts worth US$1.5 billion (nearly 
$20 billion in 2013 dollars).

The Allied nuclear-weapons project,
code-named the Manhattan Project, grew
even faster. Again building on early insights
from a British team, the Manhattan Project, 
coordinated from the Los Alamos laboratory 
in New Mexico, ballooned to encompass 
125,000 people working at 31 facilities across 
North America. By the time the atomic 
bombs were dropped on Hiroshima and 
Nagasaki in August 1945, the project had 
cost $1.9 billion (about $25 billion today).
Together, the radar and atomic-bomb pro- 
jects amounted to about 1% of US military 
expenditure during the war: modest on the 
scale of defence appropriations, but utterly 
unprecedented for the academic scientists 
and engineers caught up in the war projects. 
And it was more than just the
budgets that grew. In both projects,
physicists, chemists, metallurgists and their 
colleagues found themselves working in huge 
groups with larger-than-life equipment. Iso- 
tope-separation plants in Oak Ridge, Tennes- 
see, stretched the length of a city block; the 
nuclear-reactor facilities in Hanford, Wash- 
ington, required more than half a billion 
cubic metres of concrete. 

After the war, many physicists dismissed 
their work on such sprawling wartime pro- 
jects as temporary distractions: an important 
but limited hiatus from their ‘real’ scientific 
research. One Rad Lab veteran even com- 
posed a song soon after the war closing with 
the memorable line, “Oh, dammit! Engi- 
neering isn’t physics, is that plain? Take, oh 
take, your billion dollars, let’s be physicists 
again”.

Despite the songwriter’s plea, scientists 
did not return to the antebellum status quo. 
Instead, many characteristics of the wartime 
projects became the new normal, even in 
peacetime. The war cast a long shadow on 
how science is organized and funded, and 
even on the methods and questions that many 
scientists pursued throughout their careers. 


COMMON PURPOSE 

Until the war, most scientific research in the 
United States had been supported by private 
foundations, local industries and under- 
graduate tuition fees. After the war, scien- 
tists experienced a continuity — even an 
expansion — of the wartime funding model. 
Almost all support for basic, unclassified 
research (as well as for mission-oriented 
defence projects) came from the federal 
government. 

In 1949, 96% of all funding for basic 
research in the physical sciences in the 
United States came from defence-oriented 
federal agencies, including the Department 
of Defense and the then-new Atomic Energy 
Commission, successor to the Manhat- 
tan Project. In 1954 — four years after the 
establishment of the civilian US National 
Science Foundation — 98% of funding for 
basic research in the physical sciences came 
from federal defence agencies. And the scale 
of funding was unlike anything before the 
war. By 1953, funding for basic research in 
the United States had leapt to 25 times what 
it had been in 1938 (in constant dollars, 
adjusting for inflation). The fire hose of
federal spending paid for all kinds of inter- 
esting research. 

Much of the work was conducted in insti- 
tutions modelled on wartime examples. 
Defence projects during the war had thrown 
together experts from many different fields 
of science and engineering to work towards 
common objectives, rather than grouping 
specialists by disciplines. The enormous 
time pressures and shared goals of war 
work forced scientists and engineers to craft
effective means of communicating with each
other. Mathematical rigour and abstruse 
theoretical derivations were worth little if 
colleagues from other specialities could not 
build on the results. 

Veterans of the intense, multidisciplinary 
wartime projects came to speak of a new 
type of scientist. They touted the war-forged
‘radar philosophy’ and the quintessential ‘Los
Alamos man’: a pragmatist who could collaborate
with everyone from ballistics experts to
metallurgists, and who had a gut feeling for
the relevant phenomena without getting lost
in philosophical niceties.

Leading scientists and policy-makers 
actively sought to continue the wartime 
spirit of collaboration across disciplines. 

The Atomic Energy 
Commission over- 
saw a new network 
of national laborato- 
ries to pursue both 
civilian and defence 
research. The labs 
featured interdiscipli- 
nary teams that mixed 
physicists, mathematicians and chemists 
with engineers of many stripes. A similar
set-up appeared across dozens of US univer- 
sities: facilities straddling several academic 
departments, such as the Research Labo- 
ratory for Electronics and the Laboratory 
for Nuclear Science and Engineering, both 
founded at MIT by the end of 1945 (ref. 7). 

The facilities hummed with surplus equip- 
ment and know-how culled from the wartime 
projects. Physicist Bruno Rossi, for one, stud- 
ied cosmic rays after the war by adapting the 
sensitive timing circuits he had built at Los
Alamos to measure nuclear-fission rates.

Julian Schwinger (standing) with colleagues at MIT’s Radiation Laboratory during the Second World War.

Similarly, just months after the end of 
hostilities, self-described ‘boffins’ who had 
spent the war working on radar turned their 
attention to the impossibly small and the 
cosmically large. Some began to build radio 
telescopes and aimed them at the heavens. 
An international community coalesced, 
linking the Jodrell Bank telescopes near 
Manchester, UK, and the Parkes telescope 
in New South Wales, Australia, to similar 
instruments dotted across North America 
— from the California Institute of Tech- 
nology in Pasadena to the National Radio 
Astronomy Observatory in Green Bank, 
West Virginia. And in 1947, using repurposed
microwave-frequency electronics left
over from his wartime radar work, physicist 
Willis Lamb of Columbia University in New 
York measured a tiny shift (of about one
part in a million) in the energy levels of an
electron in the 2s and 2p orbitals of a hydrogen
atom. Lamb’s remarkable achievement
challenged physicists’ prevailing understanding
of the vacuum, the mysterious
state of lowest-possible energy.

One of the first to hear about the Lamb 
shift was physicist Julian Schwinger, who 
before the war had been a rising star in 
quantum theory. Like so many physicists at 
the Rad Lab, Schwinger had been forced to 
rethink his approach to calculation. Elegant 
derivations from first principles — which 
often proved tractable only when applied 
to idealized situations — were of little value 
to the many colleagues who needed to fine- 
tune electronics components for maximum 
efficiency. Instead, as Schwinger himself later
recalled, he internalized from the engineers a
modular, ‘effective circuit’ approach. Rather
than calculate the total electrical resistance 
of a complicated component from the lofty 
heights of Maxwell’s equations, he could 
‘blackbox’ each component, substituting its 
overall resistance as determined from meas- 
urements of inputs and outputs. The niceties 
of how current flowed between constituent 
parts of a given component mattered much 
less to the main objective (improving radar
designs) than did the effect of that component
in a given circuit”.

Schwinger approached the Lamb shift 
with his Rad Lab lessons still fresh. Since the 
1930s, senior theorists had tried to calculate 
the effects of subtle quantum fluctuations 
from first principles. Maddeningly, their 
equations always broke down, producing 
unphysical infinities instead of finite answers. 
Schwinger rearranged his equations in terms 
of measurable inputs and outputs, just as his 
engineering colleagues at the Rad Lab had 
done with real-world electronics. By recast- 
ing the calculation, Schwinger managed to 
calculate the effects of quantum fluctuations 
on the electron’s energy levels and obtain an 
answer that matched Lamb’s measurement 
to an extraordinary precision. As it turned 
out, Japanese physicist Sin-Itiro Tomonaga 
had accomplished the same goal a few years 
earlier. Tomonaga’s work on radar during 
the war had proven similarly essential to his 
theoretical approach””. 


PHILOSOPHY RETURNS 

Students protest against military research at the Massachusetts Institute of Technology in 1969.

This war-forged pragmatism produced
enormously impressive research and influenced
a generation of leading scientists.
Their approach to basic research (and
the institutions in which they pursued it)
assumed an aura of inevitability. But the
approach came with some trade-offs, largely 
unnoticed at the time. Important questions 
that resisted the powerful, phenomenological
methods tended to get eclipsed. Anything
that smacked of ‘interpretation’, or
worse, ‘philosophy’, began to carry a taint 
for many scientists who had come through 
the wartime projects. Conceptual scrutiny 
of foundations struck many as a luxury. The 
wartime style was reinforced in the United 
States by exponentially rising university 
enrolments after the war. The new classroom 
realities left little space for informal discus- 
sion of philosophy or foundations. The Rad 
Lab rallying cry of “Get the numbers out” 
shaded into “Shut up and calculate!””® 

By the mid-1960s, three-quarters of each 
year’s crop of physics PhD graduates in the 
United States specialized in either nuclear 
physics or solid-state physics: two impor- 
tant and interesting areas, to be sure, but also 
those most readily funded by defence agen- 
cies (even for unclassified, basic research). 
They were also areas in which most physi- 
cists came to agree that a pragmatic style 
could yield the greatest success. During this 
period, for example, physicists first under- 
stood the nuclear force that causes radio- 
activity, and conquered strange phenomena 
such as superconductivity — both Nobel- 
prizewinning achievements. 

Openly philosophical areas of physics, 
the intellectual roots of which stretched 
back before the war, became increasingly 
marginalized, such as grand questions 
about the birth and fate of the Universe, the 
thin border between order and disorder in
chaotic systems, or the subtle foundations
of quantum theory. Sometimes these were 
denigrated as not even being ‘real physics’ 
by influential physicists in the United States, 
although research in these areas advanced in 
other parts of the world”. 

A quarter of a century after the end of the 
Second World War, cracks in the system 
began to show. The escalation of fighting 
in Vietnam made many people question 
the dominant place of military funding on 
university campuses, and difficult economic 
conditions further drove a rapid reversal of 
fortunes in the sciences, and in physics in 
particular. Job opportunities for those with 
science PhDs fell sharply, and university 
enrolments quickly followed suit, none more 
drastically than in physics. 

The organization, funding and basic 
approach to research that had come to 
seem normal — even inevitable — after 
the war were no longer taken for granted. 
Complementary styles of research began 
to creep back in, and growing numbers of 
physicists turned to topics that had seemed 
beyond the pale just a few years earlier, such 
as cosmology, chaos theory and quantum 
entanglement”. 

Radar philosophy and the Los Alamos 
man did not disappear from view. To this 
day, most basic research in the United States 
depends on federal funding, and many of 
the great successes of the postwar generation
(such as the standard model of particle
physics) remain mainstays of research
and teaching. But that legacy now sits beside 
more recent breakthroughs born of the era 
that reclaimed more openly speculative 
and philosophical approaches to the deep 
mysteries of nature.


David Kaiser is professor of the history
of science and department head for the
Program in Science, Technology, and Society
at the Massachusetts Institute of Technology,
Cambridge, Massachusetts.

e-mail: dikaiser@mit.edu 
BOOKS & ARTS


Studies of birds such as the woodpecker finch (Camarhynchus pallidus) have aided a range of fields. 


ORNITHOLOGY 


Under their wing 


Ben Sheldon relishes a study of the broad-ranging impact 
of ornithology on modern biology. 


The study of wild birds has had a
disproportionate impact on the birth
and evolution of several branches
of science. Thus argue Tim Birkhead, Jo
Wimpenny and Bob Montgomerie in Ten
Thousand Birds.

In chronicling the development of orni- 
thology over the past 150 years, the authors 
face the challenge of encapsulating a broad 
and diffuse field. Just how broad can be 
seen in the questions it tackles today, which 
span global-scale macroecology of all birds, 
detailed individual-level behavioural vari- 
ation, the physiology of migration and the 
genomics of speciation. Birkhead and his 
colleagues — all behavioural ecologists — 
eschew both the obvious chronological 
approach and the option of presenting potted 
scientific biographies of ‘great ornithologists’.
Instead they pick, from the 380,000 papers on
birds put out since Charles Darwin published
On the Origin of Species, 11 areas in which
avian research has illuminated broader
questions in biology.


Ten Thousand Birds: Ornithology Since Darwin
TIM BIRKHEAD, JO WIMPENNY AND BOB MONTGOMERIE
Princeton University Press: 2014.


Birkhead, Wimpenny and Montgomerie 
hope to stimulate debate over which areas 
ornithology has contributed to most, but the 
ones they have chosen will surely appear on 
most lists. They range from palaeontology, 
speciation and systematics to physiology and 
comparative anatomy, by way of ethology, 
behavioural ecology and conservation. 

Research on wild birds has been key to 
understanding population dynamics, thanks 
to the ease of marking and identifying indi- 
viduals. Swedish ecologist Malte Andersson's 
1980s field experiments on the long-tailed 
widowbird (Euplectes progne) — in which 
longer tails in males were found to endow 
increased breeding success — ushered in a 
new phase of research into sexual selection. 
Similarly, birds have been inspirational
subjects for studies of natural selection ever 
since US biologist Hermon Bumpus found 
a link between morphology and individual 
survival in a flock of sparrows caught in 
an 1898 Rhode Island snowstorm. Even 
research on phenomena found only in birds 
offers general scientific insights: for instance, 
the elucidation of the mechanism behind the 
magnetic sense in avian navigation serves as 
a model for sensory physiology. 

This is a serious book that manages to be 
compulsively readable. The authors are at 
their most vivid when offering vignettes of 
individuals or specific events from the mod- 
ern history of ornithology: pivotal moments 
of discovery or the presentation of new ideas. 
Some are familiar. We have British evolu- 
tionary biologists David Lack and then Peter 
and Rosemary Grant successively exploring 
the evolution of Darwin’s finches from the 
1940s to the present, and the explosion in 
discoveries of fossil birds in China since the 
1990s. Such cases act as scene-setters in each 
chapter, and are supplemented by wonder- 
ful, less well-known examples. We meet, 
for instance, the wealthy Hungarian palaeontologist
Franz Nopcsa von Felső-Szilvás
(whose theories on early bird flight were 
influential), who in 1907 crossed Albanian 
rivers disguised as a shepherd and using an 
inflated goat’s bladder as a flotation device. 
There is a fine description of ornithologist 
Charles Sibley striding arrogantly into a 1986 
conference bearing a tapestry-sized poster 
illustrating his revolutionary new avian phy- 
logeny. The book is beautifully illustrated, 
and also contains charmingly candid auto- 
biographical sketches contributed by more 
than 20 of today’s leading ornithologists. 

How did ornithology come to have such 
a large impact on other areas? The answer 
seems to lie in the fact that young people with 
an interest in the natural world are attracted 
to observing birds, and in the ease of studying
them. With this in mind, two themes could 
have been developed further. The first is the 
importance of birdwatching in the early lives 
of many scientists who went on to excel in 
other areas, such as James Watson, co-dis- 
coverer of DNA’s double helix, and evolution- 
ary biologist Ernst Mayr. The second, which 
could fill another book, is the huge contribu- 
tion of non-professionals. From the pigeon 
fanciers who influenced Darwin to today’s 
army of digitally empowered citizen scientists 
collecting swathes of distributional and abun- 
dance data, amateur ornithologists have built 
the foundations of the modern science, and 
enabled its impact in so many other fields.


Ben Sheldon is Luc Hoffmann Professor
of Field Ornithology and director of the
Edward Grey Institute in the Department of
Zoology at the University of Oxford, UK.
e-mail: ben.sheldon@zoo.ox.ac.uk 
OLIVER JONES


Part of an instrument built with artist Oliver Jones to create sounds in Composing the Tinnitus Suites. 


Q&A Daniel Fishkin 
Tinnitus tunesmith 


Sound artist Daniel Fishkin tries to convey the experience of tinnitus. As the latest incarnation of his installation series Composing the Tinnitus Suites opens in Brooklyn, New York, he talks about building a mechanical model of the inner ear.


How did your tinnitus begin? 

It started in 2008. I was 22, and it was the 
evening of my senior recital in music com- 
position at Bard College [in Annandale-on- 
Hudson, New York]. The concert was not 
loud, but afterwards I could hear a high- 
pitched sound. I had my ears tested over and 
over. Doctors told me to get used to it, but 
habituation is not always possible. People
with tinnitus can experience depression;
some have been driven to suicide. 


What have you learned about the condition?

The mechanics of tinnitus are still poorly 
understood. Mine is likely to be from noise 
exposure, but it also occurs as a result of 
skull fractures and brain tumours, and as a 
side effect of common drugs. No treatments 
I was prescribed have
helped. I haven't had 
access to experimen- 
tal treatments such as 
[repetitive] transcra- 
nial magnetic stimu- 
lation. Because there 
is currently no way to 
regrow damaged sen- 
sory hair cells in the 
human cochlea, there 
is no cure for me. 


Composing the Tinnitus Suites
DANIEL FISHKIN
Nothing Space, Brooklyn, New York.
On 24, 25 and 31 January and 1, 8 and 9 February.


How did this affect your music?

I tried to keep composing, but it was impossible.
So I began to study
circuitry and engineering, and built a large
sculpture strung with 6-metre-long piano 
strings wired in a feedback loop that made 
gorgeous long crescendos. I didn't realize until 
I showed the device to my girlfriend, an audi- 
tory neuroscientist who studies hearing loss, 
that I had made a giant mechanical model of 
the inner ear. It became the first installation in 
my ongoing project, Composing the Tinnitus 
Suites (see go.nature.com/xhvyu9). 


Tell me about your next installation in this 
project. 

This month I will create a new musical sculp- 
ture at the Nothing Space in Brooklyn. I will 
line the walls of the gallery with piano wire 
so that when the audience enters, they will 
be walking inside the instrument. There will 
be six performances that will integrate the 
sustained sounds of my instrument with 
woodwind and drums. 


Are you trying to reproduce your experience 
of tinnitus? 

My sculptures generate tones that drift 
slowly like those in my head, but because 
my tinnitus is not an acoustic phenomenon 
— it is a perceived sound rather than actual 
sound waves — I have found that it is impos- 
sible to reproduce the sound. For inspiration, 
I have tried to enhance the ringing in my ears
by listening to loud tones for long periods 
of time, and consuming substances such as 
aspirin, caffeine and quinine, which are toxic 
to the ear. 


How will the project continue? 

The crux of the work is human agency. I 
want to connect people with tinnitus, give 
doctors a sense of what their patients are 
going through, and start a conversation 
between musicians and scientists. This piece 
will go on as long as my ears ring. Instead of
trying to get rid of the ringing, I am now asking:
what if, instead of a curse, tinnitus is a
kind of superpower?


INTERVIEW BY JASCHA HOFFMAN 
Call for ecosystem
modelling data 


We call on bioinformaticians, 
taxonomists and ecologists to 
collect, store and share new 
types of data for creating general 
ecosystem models (GEMs)
that will eventually be used for
predictive modelling of the 
biosphere (see G. Mace Nature 
503, 191-192; 2013). Funding 
bodies also need to recognize the 
importance of this work. 

Crucial requirements include: 
a database of functional traits 
for different species, which 
would allow modelling to 
take advantage of existing 
information associated with 
species-based databases; 
biotic data that indicate how 
organisms interact through 
space and time; and census 
data that quantify the number 
or weight of organisms in 
an ecosystem organized by 
functional traits. 

These comprehensive 
functional data will speed up 
the development of GEMs by 
enabling uncertainties to be 
reduced and predictions to be 
evaluated (see D. Purves et al. 
Nature 493, 295-297; 2013). 

We therefore appeal to 
taxonomists, who are usually 
concerned with morphological 
traits, to collect information on 
life histories and behavioural 
traits, especially with respect to 
species interactions. Ecologists 
can also contribute to these 
enhanced species descriptions 
by treating individuals as part 
of assemblages of the same 
species and as communities 
of different species, thereby 
providing valuable collective 
information. 

The technological capacity 
to store and share trait 
information is being developed 
by the Encyclopedia of Life 
(eol.org/traitbank). This can 
be scaled up to provide more- 
comprehensive data, such as 
those to describe interactions 
between organisms. 

Mike Harfoot United Nations
Environment Programme
World Conservation Monitoring
Centre, and Microsoft Research,
Cambridge, UK.
mike.harfoot@unep-wcmc.org
Dave Roberts Natural History 
Museum, London, UK. 
scratchpads@nhm.ac.uk 


Halt self-citation in 
impact measures 


We can narrow the gender
differences in science publishing 
and research (see V. Larivière
et al. Nature 504, 211-213; 2013) 
by making measurements of 
scientific output and impact 
fairer. 

For example, time spent 
on active research should be 
incorporated into assessments 
of research productivity. 
This would provide a fairer 
comparison for researchers 
who take parental leave or who 
have other caring duties or high 
teaching loads, and would reduce 
the pressure on those scientists. 

It would also be useful to 
halt the inclusion of author 
self-citations in measures of 
research impact, because self- 
citation is a male-biased practice 
(E. Z. Cameron et al. Trends 
Ecol. Evol. 28, 7-8; 2013). After 
all, genuine impact hinges on 
independent citation. 
Elissa Z. Cameron, Amy M. 
Edwards University of Tasmania, 
Hobart, Australia. 
elissa.cameron@utas.edu.au 
Angela M. White US Department 
of Agriculture Forest Service, 
Davis, California, USA. 


Himalayas already 
have hazard network 


Maharaj Pandit calls for the 
protection of the Himalayas 
through an international 
network to monitor 
environmental risks, develop 
early-warning systems to detect 
hazards and provide a better 
understanding of Himalayan 
geology and ecology (Nature 
501, 283; 2013). Such a network 
is in fact already in place, but 
it needs more international
support if it is to be properly 
effective. 

The International Centre 
for Integrated Mountain 
Development (ICIMOD) 
in Kathmandu, Nepal, was 
founded 30 years ago by the eight 
countries of the Hindu Kush- 
Himalayan region. ICIMOD’s 
expertise is now internationally 
recognized (S. Sarkar Himal. 
J. Sci. 45 7-8; 2007). 

ICIMOD forms a centre 
for intergovernmental 
knowledge and learning, as 
well as for regional research 
and development. It works 
for sustainable economic and 
environmental development of 
the Himalayan ecosystems, by 
monitoring risks from glacial 
lakes and providing early 
warning of hazards such as forest 
fires and flooding in many of its 
member countries. 

Furthermore, the formation 
of a Himalayan University
Consortium is strengthening 
collaboration between 
universities in the region. 
ICIMOD has been collaborating 
with governments, academics 
and non-governmental and 
community-based organizations 
from several Hindu Kush- 
Himalayan countries on 
conservation programmes to 
identify the most vulnerable 
transboundary landscape 
areas, which are also of global 
importance. 
Yadav Uprety, Ram P. 
Chaudhary Research Centre for 
Applied Science and Technology, 
Tribhuvan University, 
Kathmandu, Nepal. 
yuprety@yahoo.com 
Nakul Chettri ICIMOD,
Kathmandu, Nepal. 


Avoid pitfalls of 
consensus methods 


We would like to clarify points 
raised in William Sutherland’s 
criticism of the treatment of 
pollinators in the UK National 
Ecosystem Assessment 
(Nature 503, 167; 2013). 


The estimated economic 
costs of pollinator decline are 
only as robust as the natural 
science on which they rest, 
as Sutherland indicates. If we 
could predict with certainty the 
effects of changes in pollinator 
populations on agricultural 
production, then evaluating 
them would be trivial. 

It was because of uncertainty 
in the underlying population 
ecology that we omitted 
estimates of pollination services 
from our economic analysis 
of the impacts of land-use 
change in our report, which was 
extensively peer-reviewed (see 
also I. J. Bateman et al. Science 
341, 45-50; 2013). 

The Delphi technique 
— a consensus method 
that Sutherland mentions 
for synthesizing research 
findings — can be helpful in 
some situations, but should 
be applied with caution to 
environmental valuation. The 
rapid expansion of empirical 
literature in this field means 
that conventional beliefs 
can rapidly become group- 
think norms, with dangerous 
consequences. 

For example, we rejected the 
popular consensus in favour of 
using survey techniques as a way 
of valuing biodiversity, choosing 
instead to estimate the costs of 
ensuring species conservation. 
We stand by our approach, 
which we believe conforms with 
Sutherland’s appeal for quality 
over quantity. 

Ian Bateman, Matthew 
Agarwala, Tomas Badura 
University of East Anglia, 
Norwich, UK. 
i.bateman@uea.ac.uk 
FORUM: Developmental biology
Tethered wings 


Wnt signalling molecules are thought to direct the development of an organism by spreading through tissues. But flies grow 
with almost normal appendages even when their main Wnt protein cannot move. Two scientists discuss the implications of 
this finding for our understanding of development. SEE ARTICLE P.180 


THE PAPER IN BRIEF

• The Drosophila (fruitfly) protein Wingless
(Wg) is the prototype member of the
Wnt family of proteins, which regulate tissue
patterning and growth during development.
• Wg is thought to act as a morphogen: a
protein that forms concentration gradients
as it spreads from its site of synthesis
and that regulates gene expression as a
function of its concentration.
• On page 180 of this issue’, Alexandre
et al. describe wing formation in flies
expressing a form of Wg that is tethered to
the cell membrane, in place of the secreted
protein.
• They observe normal wing morphology,
although development is delayed and the final
wings are smaller than those of normal flies.


Non-essential
spread


GINÉS MORATA


Morphogen regulation of target genes
depends on the physical distance from
the morphogen-secreting cell population,
such that the levels of this molecule provide
a genetic reading of position, a key issue in
morphogenesis. The best examples of morphogens
come from Drosophila: the secreted
molecules Hedgehog, Decapentaplegic (Dpp)
and Wingless (Wg) have been identified as
morphogens’, and for Dpp and Wg there
is compelling evidence that they act at long
range”. It follows from the very definition of
a morphogen that the spread of the molecule
is an essential component of its function. But
Alexandre and colleagues’ results suggest that
this idea needs to be reconsidered.

A clear demonstration of long-range action
by Wg came from the finding’ that the protein
activates target genes, including vestigial (vg)
and Distal-less (Dll), in cells distant from
Wg-secreting cells. By contrast, a Wg variant protein
(Nrt-Wg) that is functional but anchored to the
cell membrane, through the addition of part of
the transmembrane protein Neurotactin, was
found to activate these target genes only in
neighbouring cells. These original experiments
were performed by artificial overexpression
of Nrt-Wg; Alexandre et al. have now used
sophisticated genome-editing technology to
generate flies in which the wg gene encodes the
Nrt-Wg protein. The edited gene contains all
the normal regulatory sequences and is therefore
expressed normally, and the method seems
to work with high efficiency, opening up the 
possibility of performing similar manipulations 
in other Drosophila genes of interest’. 
Considering the many functions of Wg 
during embryogenesis and during larval and 
adult life, and the essential role assigned to the 
protein's spread, any expert would have confi- 
dently predicted that a fly with only tethered 
Wg would not develop. But Alexandre and 
colleagues’ flies survive and are normal in 
appearance, although they grow more slowly 
than normal flies and their wings are smaller 
(Fig. 1). The authors examined the situation 
only in the wing disc, but the fact that the flies 
survive indicates that other Wg functions are 
more or less normal. The implication of their 
findings is that, at least for Drosophila, the 
long-range diffusion of Wg may be of minor 
significance — bringing into question the 
functional value of its role as a morphogen. 
How should these results be interpreted 
in light of the compelling evidence for long- 
range Wg action? Alexandre et al. confirm that 
Nrt-Wg can induce Dll and vg gene expression
only in adjoining cells, so the almost-normal 
expression of these target genes in their mutant 
flies is hard to explain. The authors propose a 
‘cellular memory’ model, in which cells initially 
expressing target genes continue to express 
them even when they no longer receive the 
expression-inducing signal. That implies that 
Dll and vg expression is perpetuated through
cell divisions, but this is not supported by pub- 
lished evidence®. An alternative explanation is 
that, although the Nrt-Wg protein is considered
to be functionally equivalent to Wg (except
for its diffusion), there might be undetected
differences in its expression levels or stability. 

Despite the need for clarifying some aspects 
of these findings, the survival of flies that have 
only membrane-tethered Wg is telling, and
the authors’ results call for a reassessment of
how we think about Wg function, and perhaps
about that of other morphogens. Wg and Dpp 
are evolutionarily conserved in all animals, 
so their mode of function is likely to be con- 
served as well. The two proteins have acted as 
model morphogens, and understanding how 
they work is of major importance to biology. 
Wnt signalling is also of biomedical interest, 
because its misregulation is associated with 
human cancers and other diseases’. Although 
there is no question that these molecules have 
a crucial role in development and disease, re- 
examining how they work might change our 
picture of these processes. 


Ginés Morata is in the Centro de Biologia 
Molecular, CSIC-UAM, Universidad 
Autonoma de Madrid, Madrid 28049, Spain. 
e-mail: gmorata@cbm.uam.es 


Long-range 
thinking 


There is compelling evidence*** that Wg
can, and normally does, act over many
cell diameters to control gene expression and 
growth of the Drosophila wing. So the remark- 
able discovery that a membrane-tethered form 
of Wg can substitute for the normal protein 
poses the question: must morphogens move 
to organize development? When considering 
this challenge to how we think of morphogens, 
the devil is in the details. 

The main phase of Drosophila wing development
begins with the induction of Wg expression
in all cells of the nascent wing and lasts
for around two days, during which time the
wing increases by about 50 times in size and
cell number. On the first day, Wg is broadly
expressed, but its expression fades progressively
in the more dorsally and ventrally positioned
cells, generating Wg gradients. These
gradients might well suffice to initially direct
normal gene expression and growth without
requiring Wg to spread, and Alexandre and
colleagues’ results support this idea.

Figure 1 | Wing development and Wingless spread. a, The Wingless protein (Wg) is thought to regulate the development of Drosophila wings by diffusing from Wg-secreting cells, thereby activating Wg target genes in distant cells as the wing grows. b, Alexandre et al.' show that, in flies expressing the Wg variant Nrt-Wg, which is tethered to the cell membrane and cannot diffuse, wings with normal morphology develop, although development is delayed and the wings are smaller than normal. Previous work’® has shown that Nrt-Wg can activate Wg target genes in nearby cells but not in distant cells (green arrow), so it remains unclear how the long-distance Wg signalling thought to be required for wing development is exerted in these flies.

During the second day, Wg production 
becomes restricted to a central stripe of cells, 
but the protein continues to control gene 
expression and growth in cells up to 15-20 cell 
diameters away. The conventional view is that 
this is because they continue to receive Wg 
secreted by the central cells. Alexandre and col- 
leagues propose instead that, once prospective 
wing cells receive Wg, they acquire a long-term 
cellular memory of Wg exposure that controls 
the behaviour of their descendants there- 
after. According to this model, the descend- 
ants of cells exposed to Nrt-Wg should grow 
and express Wg target genes even when they 
are many cell diameters away from Nrt-Wg- 
expressing cells. But previous work*® shows that 
this is not so; rather, only those cells that remain 
close to Nrt-Wg-expressing cells continue to 
express Wg target genes and grow.

Without invoking a memory model, how 
might tethered Wg mimic the long-range 
action of the normal protein? One possible 
answer comes from the observation’ that 
Nrt-Wg accumulates in secreting cells to 
several times higher levels than normal Wg, 
indicating that it is significantly more
stable and/or that it provides a more potent 
signal to neighbouring cells because it is not 
attenuated by release and dispersion. Accord- 
ingly, Nrt-Wg that is expressed during the 
first day might persist and function adventi- 
tiously during the second day, providing a 
signal that would otherwise require spread 
of the protein. Another possibility is that Wg
moves to some extent via cellular projections”;
membrane-tethered Wg might also be able 
to do this, allowing it to influence cells at least 
a few cell diameters away. A third option is 
that the downstream effects of Wg signalling 
(including the function of proteins encoded by 
target genes) might persist for several hours, 
even after cells cease to receive Wg. All of these 
factors, some normal and others artefacts of the 
Nrt-Wg system, would extend the range over 
which Nrt-Wg can influence cell behaviour 
during the second day of wing development. 
Notably, Alexandre et al. find that the 
tethered protein fails to sustain normal 
patterns of gene expression or support normal 
growth after its expression becomes restricted
to the central stripe. This results in delayed 
wing development, and wings that never reach 
full size even when they have an extra day or 
longer to catch up. 

Thus, the new results do not falsify the 
interpretation of Wg as a classic morphogen 
in the Drosophila wing. Instead, they highlight 
that Wg acts at short range during early wing 
development but must act at long range at later 
times, as its production becomes restricted and 
the population of cells requiring Wg input 
expands. The Drosophila wing will continue 
to serve as a model for understanding how 
morphogens act to organize the development 
of larger tissues (such as butterfly wings and 
vertebrate limbs), and further studies using the 
methods introduced by Alexandre et al. will 
contribute to this understanding.


Gary Struhl is in the Department of Genetics
and Development, Columbia University,
New York, New York 10032, USA.
e-mail: gs20@columbia.edu
energy storage


In flow batteries, energy is produced by passing solutions of ‘electroactive’ materials
— often, metal salts — through an electrochemical cell. A non-metallic electroactive 
material opens the way to large-scale energy storage. SEE LETTER P.195 


GRIGORII L. SOLOVEICHIK 


The adoption of intermittently available
renewable energy sources, such as
solar energy and wind power, to more 
than 20% of total energy capacity will require 
electric-energy storage systems to be deployed’. 
For grid-scale applications and remote gen- 
eration sites, cheap and flexible storage sys- 
tems are needed, but presently the options are 


either limited to a specific geographic location 
(such as pumping water from a reservoir to an 
elevated level as a source of potential energy) 
or expensive (for example, conventional bat- 
teries, flywheels and superconductive electro- 
magnetic storage)”. On page 195 of this issue, 
Huskinson etal.’ report a major advance in the 
development of economical energy storage: a 
‘flow’ battery that uses only water-soluble, non- 
metallic materials as the electrode components. 
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50 Years Ago 


The web-building of spiders 
provides a useful test of the 
activity of pharmacological agents. 
Particular substances affect 
web-building in particular ways 

... Unfortunately, the traditional 
procedure of producing discrete 
lesions in the central nervous 
system to observe behavioural 
consequences is particularly 
difficult in most arthropods 
because of the small size and
rigid cuticle of the animals ...
[H]ere, the intense and directional 
radiation of a laser beam was used 
to produce lesions ... Transient 
behavioural abnormalities 
occurred in 4 of the exposed 
spiders, apparently a consequence 
of damage to tissue with recovery 
capacities. Permanent disturbance 
occurred only after lesions in the 
cephalothorax of spiders, as the 
presumed consequence of damage 
to tissue incapable of regeneration, 
such as nerve tissue. Histological 
analysis appears to confirm 
central nervous system lesions ... 
[A] means seems to be at hand
for objectively relating spider
(and other arthropod) behaviour
to the results of histological
analysis.

From Nature 11 January 1964 


100 Years Ago 


‘Lucretius or Kapteyn?’ — Nonne 
vides etiam diversis nubila ventis 
diversas ire in partis inferna 
supernis? Qui minus illa queant per 
magnos aetheris orbis aestibus inter 
se diversis sidera ferri? 

De Rerum Natura, v., 646-9. 

See you not too that clouds from 
contrary winds pass in contrary 
directions, the upper in a way
contrary to the lower? Why may not 
yon stars just as well be borne on 
through their great orbits in ether 
by currents contrary one to the 
other? Munro’s Translation E.J.M.
From Nature 8 January 1914 




Figure 1 | Schematic of a flow battery. In this set-up, two solutions of electroactive materials (green and
purple) are stored in external tanks and pumped to flow-through electrodes in an electrochemical cell.
The materials undergo reactions at the electrodes, generating electricity when a load is connected.
A membrane between the electrodes prevents the solutions from mixing, but allows the transport of
charge-carrying ions, thus allowing electrical neutrality in the system to be maintained. In charging
mode, the power source generates a potential difference across the cell. Huskinson et al. report that
organic compounds known as quinones can be used as the electroactive material at the anode of a flow
battery. (Schematic adapted from ref. 8.)


These materials could lower the cost of flow 
batteries, while increasing the energy density. 

Flow batteries require two soluble electro- 
active components — compounds that take 
part in an electrochemical reaction at an elec- 
trode. These components are separated by an 
ion-conducting membrane in an electrochem- 
ical cell, in which chemical energy is converted 
to electricity (and vice versa). In contrast to 
the stationary electroactive materials of con- 
ventional batteries, the electroactive compo- 
nents in flow batteries are pumped through the 
cell in a flow of liquid, and are stored outside 
the cell in separate tanks (Fig. 1). This design 
allows for individual optimization of the 
amount of energy stored (which is controlled 
by the size of the storage tanks) and the power 
generated (which depends on the size of the 
electrochemical cell or stack of cells). 
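
The separation of energy and power described above can be illustrated with a short calculation. The sketch below is not taken from Huskinson and colleagues' paper: the tank volume, concentration, cell voltage and two-electron reaction are hypothetical inputs chosen only to show that stored energy scales with the tanks whereas power scales with the electrode area. The power-density figure and the cell area are the ones quoted later in this article for the small test cell.

```python
FARADAY = 96485.0  # charge carried by one mole of electrons, in coulombs

def stored_energy_kwh(tank_litres, conc_mol_per_l, electrons_per_molecule, cell_voltage):
    """Ideal energy content of one tank; depends only on the amount of material stored."""
    charge_coulombs = tank_litres * conc_mol_per_l * electrons_per_molecule * FARADAY
    return charge_coulombs * cell_voltage / 3.6e6  # joules -> kilowatt-hours

def peak_power_w(power_density_mw_per_cm2, electrode_area_cm2):
    """Power delivered by the cell; depends only on the electrode area."""
    return power_density_mw_per_cm2 * electrode_area_cm2 / 1000.0

# Hypothetical tank: 1,000 litres of 1 mol/l solution, 2 electrons per molecule, 0.8 V cell.
energy = stored_energy_kwh(1000, 1.0, 2, 0.8)
# Small test cell quoted in the article: 600 mW per square centimetre over 2 square centimetres.
power = peak_power_w(600, 2)

print(f"stored energy: {energy:.1f} kWh, peak power: {power:.1f} W")
```

Doubling the tank volume doubles the energy without touching the cell, and enlarging the cell raises the power without touching the tanks, which is the design freedom the paragraph describes.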

Because the electroactive materials in flow 
batteries are stored separately, the possibility 
that they will react violently with each other is 
almost completely eliminated, making these 
devices much safer than conventional bat- 
teries. They also have more flexible layouts 
and are potentially cheaper. Unfortunately, 
the choice of electroactive materials for flow 
batteries is limited to a small selection of 
metal redox systems (with a few exceptions 
for cathode materials), and by the low solubil- 
ity of these metal salts — typically, in water. 
The solubility problem prevents high energy
densities from being achieved.

Huskinson et al. overcome the solubility 
problem by using soluble, organic, redox-active materials
known as quinones, in place of metals, as the electroactive components.
Water-insoluble quinones were proposed as electrode
materials in 1972, but the use of this
class of compound as the energy-storage components
of a flow battery is new. The authors
found that the chemical reduction of their qui- 
nones to form hydroquinones in water at an 
electrode is very fast, which is a prerequisite 
for high-power battery discharge. 

Redox potentials and the solubilities of 
metal complexes can be tuned by modifying 
the ligands bound to the metal atoms. With 
quinones, these properties can be modified 
by changing the chemical groups attached to 
the aromatic rings of the molecules. This offers 
much wider scope for modification than is 
possible for metal systems, because the chemi- 
cal groups are closer to the redox centre than 
ligands are to metals, and so their effect is more 
pronounced. In addition, having negatively 
charged electroactive species — quinones to 
which negatively charged groups are attached, 
such as those used by Huskinson et al. — 
should help to reduce one of the major practical 
problems associated with flow batteries, namely 
the crossover of these materials through the 
negatively charged ion-conducting mem- 
brane. However, this approach has a downside, 
because any improvements in electrochemical 
properties and/or solubility will be associated 
with an increase in the molecular weight of the 
electroactive species, and will therefore reduce 
the energy density of the battery. 

Huskinson and co-workers coupled the
liquid quinone/hydroquinone system at the
anode of their battery with a bromine/bromide
system at the cathode. This cathode system has
previously been used in a zinc-bromine flow
battery and in a hydrogen-bromine regenerative
fuel cell (a variant of a flow battery).
The bromine/bromide cathode provides good 
energy density at a reasonable cost, although 
it is corrosive and environmentally unfriendly. 
When the authors tested a small version 
(2 square centimetres) of their flow battery, 
they found that it gave a respectable power 
density (600 milliwatts per square centimetre) 
and good current efficiency (the efficiency 
with which charge is transferred to allow a 
targeted electrochemical reaction to occur). 

The new findings open the way to inexpen- 
sive energy storage, but there is a long way to 
go to develop a practically useful flow battery. 
In particular, several issues must be addressed 
before this chemistry can be used in grid-scale 
energy storage. The authors studied only qui- 
none reduction, so the reverse reaction — the 
oxidation of hydroquinones — should also be 
investigated. If the reverse reaction is as fast 
as quinone reduction, then quinones could 
potentially be used in high-power devices. 
The effect of the electroactive-species con- 
centration, and of impurities in the quinones, 
on the cell’s performance and ability to be 
used through many charge-discharge cycles 
must be evaluated. If high-purity quinones are
needed, the cost could increase noticeably.

Bromine crossover through the membrane 
should also be considered seriously. Even if 
bromine does not react with compounds in 
the anode system, such crossover will reduce 
battery capacity and energy efficiency (the 
ratio of electrical-energy output to input), 
which should be measured as a function of 
cycle number. Scaling up from a small single 
cell to an industrial-sized, multi-cell stack 
may be challenging, and integrating the vari- 
ous components of a large-scale device into a 
working battery might also be difficult. For sta- 
tionary energy storage, a long life (more than 
10,000 cycles) is key to keeping costs down, 
so the number of cycles demonstrated in the 
paper (15) is far from that needed. 

Nevertheless, Huskinson and colleagues’ 
results are promising, and may serve as the 
basis for a new flow-battery technology. If long-term
capacity and energy-efficiency retention
can be demonstrated, and if practically useful
batteries can indeed be prepared cheaply, then
this technology will be suitable for a wide array
of energy-storage applications.

Detective work on drug dosage


Patients differ in their requirement for, and response to, various drug doses. A 
general platform that allows continuous monitoring of drug levels in the blood 
of rats may open the door to patient-specific dosing. 


RICHARD M. CROOKS 


Choosing the right drug dose for a particular
patient is more of an art than
a science. For example, the dosage of
most drugs is simply based on patients’ age:
“Adults and children 12 years and over, take
2 tablespoons every 6 hours.” In
reality, there is patient-to-patient variability 
in drug metabolism and excretion, highlight- 
ing the need for accurate and patient-specific 
approaches to monitor drug concentration 
after administration. Taking a big step towards 
this goal, Ferguson et al. describe, in a paper
published in Science Translational Medicine, a 
detection system that allows real-time tracking 
of drugs in the blood. 

For a few relatively toxic drugs, several 
factors, including gender, body mass or body 
surface area, are taken into account to better 
estimate the effective dose. For drugs with 
a particularly narrow window between the
minimum effective and toxic doses, clinicians
often opt for ‘peaks and troughs’ measure- 
ments. In this approach, blood is drawn half 
an hour after a drug dose is given, when the 
drug's concentration in the blood is likely to be 
highest; a second sample is then drawn imme- 
diately before the next dose is due. From these 
two isolated data points the drug’s pharma- 
cokinetics (the rate at which its concentration 
in the blood falls) is inferred, and from that the 
optimal dosing regimen for that specific patient 
is determined. 
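
The inference from two data points can be made concrete with a small worked example. The sketch below assumes the simplest textbook case, in which the drug is cleared by first-order (exponential) elimination from a single compartment; that model, the function names and the sample concentrations are illustrative assumptions, not details given in this article or by Ferguson and colleagues.

```python
import math

def elimination_rate(c_peak, c_trough, hours_between):
    """First-order rate constant (per hour) from a peak and a trough measurement."""
    return math.log(c_peak / c_trough) / hours_between

def half_life(rate_constant):
    """Time for the blood concentration to fall by half."""
    return math.log(2) / rate_constant

def concentration(c_peak, rate_constant, hours_after_peak):
    """Predicted concentration at a given time after the peak."""
    return c_peak * math.exp(-rate_constant * hours_after_peak)

# Hypothetical patient: 8 units at the peak, 2 units six hours later.
k = elimination_rate(8.0, 2.0, 6.0)
print(f"rate constant: {k:.3f} per hour, half-life: {half_life(k):.1f} hours")
```

With only two samples, everything between them is interpolated from the assumed model, which is why continuous measurement would be expected to give a much sharper picture.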

Even the most advanced methods used to 
estimate an appropriate drug dose are rather 
crude and imprecise. Existing measure- 
ments of pharmacokinetics typically involve 
drawing blood and sending it to a central 
lab for analysis. The ability to monitor drug 
levels in the blood continuously in real time 
in the clinic would vastly improve the preci- 
sion of such measurements. This, in turn, 
would greatly improve the ability to tailor
drug dosing to a patient's specific needs.

Figure 1 | MEDIC in action. Ferguson et al. describe a detector they call MEDIC, which could
potentially be used as follows. A patient's blood enters the multilayered device, where it flows along a
microfluidic channel and encounters, but does not mix with, a separate stream of saline buffer. The buffer
layer acts as a continuous flow filter: small-molecule drugs rapidly diffuse to the surface, whereas more
slowly diffusing large proteins and blood cells cannot reach it. Aptamer probes, which are immobilized on
multiple electrodes along the top of the channel, act as molecular switches: on binding to a specific drug,
they change shape and induce an electrical current. The current is directly related to the concentration of
the target drug, allowing accurate quantification. (Figure adapted from ref. 1.)

Measuring drug levels continuously requires 
a technology that is reversible, so that the sen- 
sor’s response rises and falls in concert with 
fluctuating drug concentrations. Moreover, the 
technology must be continuous, of course, and 
so should not rely on wash steps or other batch 
processes. Finally, it must be sufficiently selec- 
tive to be used on whole blood. Unfortunately, 
although conventional analytical methods, 
including chromatography, spectroscopy and 
immunochemistry, often have one or more 
of these attributes, no general approach has 
achieved all these goals simultaneously. There 
are ways to measure a few specific molecules in 
the body in real time (for instance, blood glu- 
cose levels in patients with diabetes), but these 
are single-analyte sensors that are not easily gen- 
eralizable to the detection of other molecules. 

Ferguson et al. describe a sensor that clev- 
erly links the above three technologies; they 
call it microfluidic electrochemical detector 
for in vivo continuous monitoring (MEDIC). 

The sensing technology underlying this
platform is a reagent-free electrochemical
device that uses the binding-induced folding
of aptamers (artificially selected nucleic
acids that bind specific molecular targets) to
signal the presence of a given analyte. This
reagent-free, wash-free sensing architecture
has previously been shown by some of the
same authors to support continuous measurements
in flowing, undiluted blood serum.
The approach fails, however, when the sensor 
is challenged with whole blood, owing to the 
nonspecific adsorption of molecules onto the 
electrode surface, which progressively deacti- 
vates the associated aptamers — thereby lead- 
ing to baseline drift in the output signal. 

To eliminate this drift, Ferguson and col- 
leagues took a two-pronged approach. The 
first was to place the sensors in a microfluidic 
device that insulates them with a micrometres- 
thick stream of buffer. Blood continuously col- 
lected from the subject (by a cannula) is drawn 
into the device, where it forms a laminar flow 
over this buffer (Fig. 1). Because the drug mol- 
ecules are small, they quickly diffuse through 
the buffer layer to reach the sensor surface. 
The much larger blood cells and other large 
interfering agents diffuse too slowly to reach 
the buffer stream, so sensor fouling is essen- 
tially eliminated. 
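
The size-based filtering described above can be estimated from basic diffusion physics. The sketch below uses the Stokes-Einstein relation and the one-dimensional diffusion time t = x²/(2D); the particle radii, the 10-micrometre layer thickness, the temperature and the viscosity are rough, hypothetical values chosen for illustration and are not taken from Ferguson and colleagues' paper.

```python
import math

BOLTZMANN = 1.380649e-23  # joules per kelvin

def diffusion_coefficient(radius_m, temperature_k=310.0, viscosity_pa_s=0.7e-3):
    """Stokes-Einstein estimate for a sphere in a water-like liquid (assumed values)."""
    return BOLTZMANN * temperature_k / (6 * math.pi * viscosity_pa_s * radius_m)

def diffusion_time_s(distance_m, diff_coeff):
    """Typical time to diffuse a given distance in one dimension."""
    return distance_m ** 2 / (2 * diff_coeff)

LAYER_THICKNESS = 10e-6  # hypothetical buffer-layer thickness, in metres

# Hypothetical radii: small-molecule drug, large protein, blood cell.
for name, radius in [("drug", 0.5e-9), ("protein", 3.5e-9), ("blood cell", 4e-6)]:
    d = diffusion_coefficient(radius)
    t = diffusion_time_s(LAYER_THICKNESS, d)
    print(f"{name}: crossing time of roughly {t:.3g} s")
```

Because the crossing time grows in proportion to particle radius, a layer that a drug molecule crosses in a fraction of a second can hold back a cell for minutes, which is longer than the fluid would be expected to spend in such a channel.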

The authors’ second advance was to interro- 
gate their electrochemical aptamer probes 
using a method, known as square-wave volt- 
ammetry, operating at two discrete frequen- 
cies. Specifically, they identified matched 
frequency pairs at which the output signal 
drifts in concert while responding very dif- 
ferently to the presence of the target. Tak- 
ing the difference between these two signals 
effectively eliminates drift. Combining the two 
approaches, the authors’ device achieves multi-hour,
continuous measurements on whole,
undiluted blood with baseline stabilities in the
submicromolar range of drug concentration.
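
The logic of the two-frequency correction can be sketched in a few lines. The toy model below assumes that drift adds equally to the signals recorded at both frequencies while the drug changes them by different, known amounts; the gains, the additive form of the drift and all function names are hypothetical simplifications, not the authors' published signal-processing procedure.

```python
def simulate_signals(drug_levels, drift, gain_f1=1.0, gain_f2=-0.5):
    """Toy signals at two frequencies: same drift, different response to the drug."""
    signal_f1 = [gain_f1 * c + d for c, d in zip(drug_levels, drift)]
    signal_f2 = [gain_f2 * c + d for c, d in zip(drug_levels, drift)]
    return signal_f1, signal_f2

def drift_corrected_levels(signal_f1, signal_f2, gain_f1=1.0, gain_f2=-0.5):
    """Subtract the two signals to cancel the shared drift, then rescale to drug level."""
    return [(a - b) / (gain_f1 - gain_f2) for a, b in zip(signal_f1, signal_f2)]

# Hypothetical drug time course with a steadily growing baseline drift.
drug = [0.0, 1.0, 2.0, 1.5, 1.0, 0.5]
drift = [0.1 * i for i in range(len(drug))]

f1, f2 = simulate_signals(drug, drift)
recovered = drift_corrected_levels(f1, f2)
print([round(x, 6) for x in recovered])
```

Either signal on its own would be contaminated by the drift; only the difference tracks the drug, which is why a matched pair of frequencies is needed.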

The team demonstrated the ability of the 
MEDIC platform to monitor the chemothera- 
peutic drug doxorubicin and the antibiotic kan- 
amycin in the blood of anaesthetized rats over 
the course of several hours. The pharmaco- 
kinetics derived correspond to long-established 
values’ obtained by laboriously drawing blood 
samples and then, much later, measuring each 
by using chromatography. In the present paper, 
by contrast, the measurements were made in 
real time, which not only is convenient but also 
improves their precision. 

There are some disadvantages to this plat- 
form, however. MEDIC requires continuous 
blood draws (of just a few hundred micro- 
litres per hour) and a pump to maintain the 
flow of buffer through the device. It is therefore 
unsuitable for continuous, real-time monitor- 
ing of metabolites or drugs in the blood of a 
mobile patient going about their daily life. 
Nevertheless, by enabling convenient, high- 
precision measurements of pharmacokinet- 
ics in the clinic, the technology could fuel 
further advances in personalized medicine 
by supporting truly individualized dosing
regimens. Indeed, the ability to monitor blood
drug concentrations in real time could pave 
the way to proactive, high-precision dosing in 
which drug delivery is modulated on the go in 
response to hour-to-hour changes in a patient's
metabolism or health status. Such feedback- 
controlled drug delivery could, in turn, open 
the door to therapies in which drugs with pre- 
viously unduly complex dosing regimens or 
unacceptably narrow therapeutic indices are 
administered safely and effectively.

An atomic SQUID


Superconducting quantum circuits are the core technology behind the most 
sensitive magnetometers. An analogous device has now been implemented using 
a gas of ultracold atoms, with possible applications for rotation sensing. 


CHARLES A. SACKETT 


When a magnetic field needs to be
measured with the utmost precision,
a superconducting quantum
interference device (SQUID) is the instrument
of choice. Its exquisite sensitivity derives
directly from a macroscopic manifestation of 
quantum mechanics, making it an archetype 
of quantum engineering. Reporting in Physical
Review Letters, Ryu and colleagues demonstrate
an analogue of a SQUID using an
ultracold gas of neutral atoms known as a Bose- 
Einstein condensate. Here, the analogue to the 
magnetic field is a physical rotation, so the 
atomic device could prove useful for rotation 
sensing and vehicle navigation. More broadly, 
it strengthens the correspondence between 
atomic and solid-state systems. Because atomic 
systems are better understood and more easily 
controlled than their solid-state counterparts, 
atoms might eventually serve as a design plat- 
form for complex solid-state quantum devices. 
A conventional SQUID is a small ring of 
superconducting material cut in half by two 
non-superconducting barriers. Wire leads connected
to each side of the device allow a current
to pass through it (Fig. 1a). Within each of the
superconducting regions, electrons act like a 
coherent quantum wave. Because the current 
passing through the SQUID can take either 
path around the ring, the two corresponding 
waves can interfere: they can add constructively 
with the peaks of the waves lined up, or cancel 
destructively with the peaks of one wave aligned 
to the troughs of the other. The total current 
through the ring depends sensitively on the 
type of interference. For charged particles such 
as electrons, the way that the waves align is set 
largely by the magnetic field threading the ring, 
which makes the SQUID a good magnetometer. 
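
The dependence of the current on the interference condition can be illustrated with the standard textbook relation for an idealized SQUID, in which the largest current that can flow without resistance is 2·I_c·|cos(πΦ/Φ₀)|, where Φ is the magnetic flux through the ring and Φ₀ = h/2e is the flux quantum. The sketch below assumes two identical barriers and a ring of negligible inductance; the critical-current value is hypothetical, and the formula is offered as background rather than as something stated in this article.

```python
import math

PLANCK = 6.62607015e-34             # joule seconds
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
FLUX_QUANTUM = PLANCK / (2 * ELEMENTARY_CHARGE)  # webers

def max_supercurrent(flux_wb, barrier_critical_current_a):
    """Largest resistance-free current through an ideal, symmetric two-barrier ring."""
    phase = math.pi * flux_wb / FLUX_QUANTUM
    return 2 * barrier_critical_current_a * abs(math.cos(phase))

CRITICAL_CURRENT = 1e-6  # hypothetical critical current of each barrier, in amperes

for fraction in (0.0, 0.25, 0.5, 1.0):
    current = max_supercurrent(fraction * FLUX_QUANTUM, CRITICAL_CURRENT)
    print(f"flux = {fraction:.2f} flux quanta -> {current * 1e6:.3f} microamperes")
```

The current swings from its maximum to nearly zero when the flux changes by just half a flux quantum, about 10⁻¹⁵ webers, which is the origin of the device's sensitivity.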

In the atomic analogue demonstrated by Ryu 
and colleagues, the superconducting electrons 
are replaced by a Bose-Einstein condensate 
consisting of a few thousand rubidium atoms 
at nanokelvin temperature, isolated in an ultra- 
high-vacuum chamber’. Like the electrons, the 
atoms in a Bose-Einstein condensate act as a 
wave, allowing similar physics to be probed. 
Here, the atoms are held in a ring-shaped trap 
that has two small potential-energy barriers 
through which the atoms can tunnel (Fig. 1b). 
The authors created the ring trap using a 
technique known as a painted potential. For
this, a laser beam is rapidly scanned across the
trapping region and selectively turned on so
as to illuminate only a ring-shaped area. The
atoms are attracted to the laser light and confined
within the ring. The tunnelling areas are
produced by reducing the light intensity at two
spots on the ring.

Figure 1 | Quantum interference devices. a, A conventional superconducting device consists of a ring
of superconducting wire split by two non-superconducting barriers (blue). The current (thick black lines)
through the loop must tunnel through the barriers (thinner lines). b, The atomic version demonstrated
by Ryu et al. is a Bose-Einstein condensate (red) held in a ring-shaped trap. The ring is broken by two
potential-energy barriers. Instead of passing the atoms through the barriers, here one of the barriers is
moved around the ring so as to pass through the atoms, as indicated by the arrow. The distribution of
the atoms between the two regions reveals the dynamics of atom motion, which corresponds well to the
electron currents in the superconductor.

Because the atoms are neutral, the con- 
densate version is not especially sensitive to 
magnetic fields. However, if the whole appa- 
ratus rotates, then the atoms will experience 
the Coriolis force, which twists the path of 
any object moving on a rotating platform. 
For example, on the Earth, the Coriolis force 
causes the circulating air flow of hurricanes 
and cyclones. It affects the atomic waves much 
like a magnetic field affects electrons. By meas- 
uring how the atoms move through the ring, 
even a tiny Coriolis force can be detected, 
making the system useful for sensing rotation’. 

Ryu and colleagues’ work builds on previ- 
ous demonstrations (see, for example, ref. 4) 
of atomic systems similar to a SQUID, but for 
the first time uses the complete geometry of a 
ring with two barriers. The observed behav- 
iour of the authors’ atoms is in good accord 
with the phenomenological model used to 
describe superconducting devices. Indeed, for 
the atomic case, the expected behaviour can 
be derived nearly from first principles, so the 
system is on firm theoretical ground. 

Rotation sensors are useful for vehicle 
navigation and other geophysical applica- 
tions, and the atomic SQUID shows promise 
for advancing these technologies. A greater 
impact, however, may derive from the dem- 
onstration of how atomic systems can replicate 
solid-state devices. Although solid-state cir- 
cuits have the practical benefit of not requir- 
ing lasers and vacuum chambers, developing 
a new device involves painstaking fabrication 
and characterization work. By contrast, the 
size and shape of an atom trap can be modi- 
fied simply by reprogramming the behaviour 
of the laser beam. The atom system could 
thus serve as a design tool for complicated
circuits, in which the geometry could be developed
and optimized before being applied
to superconductors. Although ordinary 
computer simulation can serve a similar 
purpose, a physical device can easily become 
too complex for simulation to be practical. 
Such complexity is common in quantum 
systems in which particle interactions are 
important and non-trivial, including high- 
temperature superconductors and, perhaps 
one day, quantum computers. 

The idea that ultracold atoms could be used 
to simulate and explain solid-state systems 
has been a driving force in the atomic-physics 
community since the first observations of 
Bose-Einstein condensation’. Ryu and col- 
leagues’ demonstration that a useful device 
such as a SQUID can be implemented with
atoms is a milestone in this effort. Nonetheless,
substantial challenges remain. An immediate 
issue is the difficulty of using atomic systems 
to model macroscopic currents: the number 
of atoms in a condensate is relatively small, so 
there is no simple way to create a large cur- 
rent. The authors sidestep this problem by 
measuring a small current flowing through 
a barrier, rather than a large current passing 
through the ring as a whole (compare Fig. 1a
and b). Although this set-up can be used for 
rotation measurements, it does not reflect the 
actual operation of a superconducting SQUID. 
A larger question is how well the correspond- 
ence between atoms and solid-state systems 
will hold up as the system's complexity grows. 
Until the systems become too complicated for 
computer simulation, the utility of the atomic 
experiments as a design platform for solid- 
state systems will probably be limited. Meeting 
the challenges involved will not be easy, but the 
steady progress in this field exemplified by Ryu 
and colleagues’ achievement is encouraging.

Ringside views


Two crystal structures reveal that the Vif and Vpx proteins of human and simian 
immunodeficiency viruses mediate evasion of host defences by reprogramming 
the cellular protein-degradation machinery. SEE LETTERS P.229 & P.234


MICHAEL H. MALIM 


The human immune system uses myriad
adaptive and innate mechanisms to
fight HIV infection and AIDS. Prominent
among these is a collection of widely
expressed cellular proteins called restriction 
factors, which can potently suppress viral 
replication’. But human and simian immuno- 
deficiency viruses, including HIV-1, encode 
several dedicated regulatory proteins that 
enable them to evade restriction factors, 
thus ensuring their survival and propagation.
Two papers in this issue show how
the viral Vif and Vpx regulatory proteins
bind to key host-cell partners and targets,
culminating in the removal of restriction 
factors from infected cells. 

The substrates for Vif are members of the 
APOBEC3 (A3) protein family, namely A3D, 
A3F, A3G and A3H; the substrate for Vpx is the 
SAMHD1 protein. These are all enzymes that 
interfere with reverse transcription, an essential 
phase of HIV replication in which the viral RNA 
genome is copied into DNA. The APOBEC3 
proteins are cytidine deaminases that are cap- 
tured by virus particles as they assemble. The 
proteins induce destructive hypermutation of 
nascent viral DNA and suppress its synthesis’. 
SAMHD1 is a deoxynucleoside triphosphate
triphosphohydrolase that reduces the cellular
levels of dNTPs (the substrates for DNA
synthesis), thereby suppressing reverse transcription,
especially in non-proliferating cells
such as myeloid cells and resting T cells.

Figure 1 | Destruction of host-cell antiviral proteins. a, Guo and colleagues’ crystal structure shows
the HIV-1 Vif protein occupying a central position as a substrate receptor in the CRL5 complex formed
between host-cell proteins CBF-β, ELOB-ELOC, CUL5, an E2 enzyme and RBX2. The substrate, an
APOBEC3 protein, is recruited by Vif and marked by ubiquitin molecules (Ub), which are transferred
from E2 by RBX2. This tags the substrate for degradation by the cell. b, Schwefel et al. show that the
Vpx protein from sooty mangabey simian immunodeficiency virus occupies a comparatively peripheral
position in the complex it forms with host-cell proteins. Vpx binds DCAF1 and recruits the substrate
SAMHD1; DDB1 has an analogous role to ELOB-ELOC. The proteins interact with CUL4A, RBX1 and
an E2 enzyme to form a CRL4A complex, and SAMHD1 is tagged for destruction by ubiquitination.

Vif and Vpx both work by directly recruit- 
ing their target restriction factors to host-cell 
cullin-RING ubiquitin ligases (CRLs), a diverse 
family of multicomponent enzymes that add 
ubiquitin chains to substrates’, thereby mark- 
ing them for destruction by the proteasome (a 
cellular protein-degrading machine). The CRLs 
are assembled with a cullin (CUL) protein (of 
which there are six in humans) as the central 
scaffold. Their catalytic core is built around the 
carboxy terminus of the CUL and also contains
an RBX (or ROC) RING-finger protein and an 
associated E2 ubiquitin-conjugating enzyme 
(Fig. 1). The amino-terminal region is devoted 
to substrate recruitment: typically, the CUL 
binds a substrate adapter molecule, which in 
turn connects to a substrate receptor and its 
bound substrate. Now, Guo et al. and Schwefel
et al. present the contrasting mechanisms used
by Vif and Vpx to engage CRLs. 

Previous attempts to resolve the structure 
of the Vif protein, either alone or in asso- 
ciation with its CUL5-based CRL (CRL5) or 
APOBEC3 substrates, had been unsuccessful.
A game-changing advance came with the
discovery that the transcription factor
CBF-β is also required for the function of
HIV-1 Vif. It rapidly emerged that Vif and
CBF-β form a stable heterodimer, and this
advance enabled the purification of enough
soluble full-length Vif for structural studies.
Guo et al. (page 229) describe the first crystal
structure of a complex comprising Vif-CBF-β,
the ELOB-ELOC heterodimeric substrate
adapter and an amino-terminal fragment of
CUL5 (Fig. 1a). The structure shows that Vif
occupies a crucial nucleating position within
this pentameric complex, simultaneously
interacting with CBF-β, CUL5 and ELOC, and
promoting CRL assembly. By contrast, CBF-β
contacts only Vif and seems to serve a chaperone-like
function by helping Vif to fold into an
active conformation. Interestingly, the contacts
that Vif makes with ELOC (through an evolutionarily
conserved peptide sequence called the
BC box) and CUL5 imitate those made by the
cellular protein SOCS2, a CRL substrate recep- 
tor involved in the downregulation of growth- 
hormone signalling. This suggests that the two 
proteins have adopted a similar mechanism 
for CRL5 recruitment. A fascinating future 
step would be to add APOBEC3 proteins to 
the complex; as discussed by the authors, the 
amino-acid residues in Vif that are required for 
engaging A3F or A3G are solvent accessible, 
and are therefore predicted to be available for 
direct interactions with these substrates. 
Interest in Vpx intensified following the 
discovery that it provokes the degradation 
of SAMHD1 during the early stages of virus 
infection. HIV-2 and diverse simian immuno- 
deficiency viruses (SIVs) encode Vpx or Vpr 
proteins with this function’. Interestingly, 
HIV-1 does not, raising important questions of 
whether and how it evades SAMHD1-mediated
restriction. Building on earlier structural studies
from their group, Schwefel et al. (page 234)
now present the crystal structure of a complex
between the Vpx-binding element of SAMHD1,
the carboxy-terminal WD40 domain of the 
CRL4A substrate receptor DCAF1, and the Vpx 
protein of the SIV that infects sooty mangabeys 
(Fig. 1b). The structure shows that all three 
components contact each other, with extensive 
interactions between Vpx and DCAF1 creating
a shared surface to which SAMHD1 binds.
Although structures of Vpx-containing com- 
plexes with additional CRL components are 
eagerly anticipated, the authors proceeded to 
model a CRL-Vpx-SAMHD1 complex using 
previously determined structures. Satisfyingly, 
the model shows SAMHD1 positioned close 
to the RING domain of RBX1, a location that 
would be expected to render it receptive to 
ubiquitination. 

Thus, these new reports show that although 
Vif and Vpx both manipulate CRLs to recognize
host antiviral proteins and trigger their ubiqui- 
tination, they achieve this through contrasting 
mechanisms: Vif occupies a central organizing 
position and acts as a substrate receptor, whereas 
Vpx operates more peripherally to remodel the 
substrate receptor and facilitate substrate bind- 
ing. These papers highlight not only the remark- 
able structural flexibility of assembled CRLs, but 
also the unrelenting capacity of viral proteins to 
commandeer cellular pathways for the benefit 
of the virus. Visualizing such structures and 
their underpinning protein interactions at 
atomic-level detail should inspire rational 
drug-design efforts aimed at interfering with 
fundamental aspects of CRL function. Pharma- 
cological interventions that spare restric- 
tion factors from virus-induced elimination 
may offer a further therapeutic approach for 
treating HIV infections.

CORRECTION

The News & Views article ‘Cell biology: The 
beginning of the end’ by Judith Campisi 
(Nature 505, 35-36; 2014) omitted the name 
of the first author of ref. 5, Muñoz-Espín, in
the final sentence of the first paragraph. The 
online versions of the article are correct.

Diversity of ageing across the tree of life


Owen R. Jones, Alexander Scheuerlein, Roberto Salguero-Gomez, Carlo Giovanni Camarda, Ralf Schaible,
Brenda B. Casper, Johan P. Dahlgren, Johan Ehrlén, Maria B. Garcia, Eric S. Menges, Pedro F. Quintana-Ascencio,
Hal Caswell, Annette Baudisch & James W. Vaupel


Evolution drives, and is driven by, demography. A genotype moulds its phenotype’s age patterns of mortality and fertil- 
ity in an environment; these two patterns in turn determine the genotype’s fitness in that environment. Hence, to 
understand the evolution of ageing, age patterns of mortality and reproduction need to be compared for species across 
the tree of life. However, few studies have done so and only for a limited range of taxa. Here we contrast standardized 
patterns over age for 11 mammals, 12 other vertebrates, 10 invertebrates, 12 vascular plants and a green alga. Although it 
has been predicted that evolution should inevitably lead to increasing mortality and declining fertility with age after 
maturity, there is great variation among these species, including increasing, constant, decreasing, humped and bowed 
trajectories for both long- and short-lived species. This diversity challenges theoreticians to develop broader perspec- 
tives on the evolution of ageing and empiricists to study the demography of more species. 


To examine demographic age trajectories across the tree of life, we 
studied life tables (that is, patterns of mortality and fertility over age) 
and population projection matrices” for multicellular species from a 
wide range of taxonomic groups (Fig. 1; see Supplementary Methods 
for data sources and further rationale). We strived to find species with 
reliable data and from diverse taxa. From the data for each species we 
estimated smoothed trajectories of fertility, mortality and survivorship 
over age. Further research will undoubtedly refine the curves shown for 
many of the species in Fig. 1 and reveal variation in different environ- 
ments and for different genotypes, but the general patterns are, we 
believe, serviceably accurate. 

We standardized the demographic trajectories to facilitate comparison.
Specifically, we standardized the age axis so that it starts at the
mean age of reproductive maturity and ends at a terminal age when 
only 5% of adults are still alive. After this terminal age, sample sizes 
were usually small and determination of age was often problematic. 
Fertility and mortality were mean-standardized by dividing age-specific 
fertility and mortality by the respective weighted average levels of fer- 
tility and mortality for all adults alive from maturity to the terminal 
age (see Methods). We refer to these standardized values as relative 
fertility and relative mortality. From the highest level of relative mor- 
tality at the terminal age (Fig. 1, top left) to the lowest level (Fig. 1, bottom 
right), species are ordered sequentially, row-by-row and from left-to- 
right. For the 46 diverse species depicted here, the range of variation in 
trajectories of fertility and mortality is unexpected. As an indication of 
variability across species, in modern Japanese women (Fig. 1, top left), 
mortality at the terminal age (102 years) is more than 20 times higher 
than the average level of adult mortality, whereas for white mangrove 
(Avicennia marina; Fig. 1, bottom right) the level of mortality at 123 
years is less than half the average adult value. 
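
The standardization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the one-unit age step and the toy hazard and fertility values are all assumptions made for the example.

```python
import math

def survivorship_from_maturity(hazard):
    """Survivorship l_x at the start of each age step (l = 1 at maturity).

    Assumes one hazard value per unit age step (an assumption of this sketch).
    """
    surv, alive = [], 1.0
    for mu in hazard:
        surv.append(alive)
        alive *= math.exp(-mu)
    return surv

def steps_to_terminal_age(surv, threshold=0.05):
    """Number of age steps up to and including the first age where <= 5% survive."""
    for i, l in enumerate(surv):
        if l <= threshold:
            return i + 1
    return len(surv)

def relative(rate, surv):
    """Divide each age-specific rate by its survivorship-weighted adult mean."""
    mean = sum(r * l for r, l in zip(rate, surv)) / sum(surv)
    return [r / mean for r in rate]

# Toy, made-up adult trajectories (not data from the study).
hazard = [0.05 * 1.15 ** age for age in range(60)]   # mortality rising with age
fertility = [2.0] * 60                               # constant fertility

surv = survivorship_from_maturity(hazard)
n = steps_to_terminal_age(surv)
rel_mortality = relative(hazard[:n], surv[:n])
rel_fertility = relative(fertility[:n], surv[:n])
```

In this toy case relative fertility is flat at 1, whereas relative mortality starts below 1 and ends above it; the final value of `rel_mortality` plays the role of the terminal-age ratio quoted in the text.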

Such variability is not predicted by the standard evolutionary theories 
of ageing’ °. Such theories provide explanations solely for age patterns
of increasing mortality and decreasing fertility from maturity; the dis- 
posable soma theory® does so for species that segregate the germ line 
from the soma. Furthermore, for those species that show a lifetime 
increase in mortality, the canonical theory cannot account for the diffe- 
rent magnitudes of that increase, although the disposable soma theory 
points to the crucial importance of trade-offs between the allocation of 
limited resources to repair and maintenance versus fertility and other 
imperatives. 


Mortality 


The most notable pattern is the mortality trajectory for post-industrial 
humans, exemplified by Japanese women in 2009. The steep rise in rela- 
tive mortality for the Japanese women is extreme compared with the 
rise for other species and sharper than that for historical populations 
such as the Swedish cohort born in 1881 and for hunter-gatherers such 
as the Aché of Paraguay whose mortality experience may be typical of 
humans over most of human existence’’. The increased steepness of 
the rise of human mortality has largely occurred over the past century, 
indicating that it was behavioural and environmental change (includ- 
ing advances in health care) and not genetic change that moulded the 
current pattern’ °. Our close relatives, chimpanzees (Pan troglodytes) 
and baboons (Papio cynocephalus) also show a rise in mortality with 
age but far less than that for hunter-gatherers. 

In several species mortality declines with age (Fig. 1, bottom row) 
and, in some cases, notably for the desert tortoise (Gopherus agassizii), 
the decline persists up to the terminal age. In other cases, an initial 
decline is followed by more or less constant mortality (for example, 
netleaf oak, Quercus rugosa). For species for which the underlying data 
are based on stages, such as dwarf gorse (Ulex minor) or the red-legged 
frog (Rana aurora), an asymptote is inevitable at older ages*”°. To alert 
readers to this, the mortality (and fertility and survival) curves derived 
from stage-classified models are represented by dashed curves in Fig. 1
at ages beyond which a cohort will have converged to within 5% of the
quasi-stationary distribution (see Methods).

Figure 1 | Demographic trajectories. Relative mortality (red) and fertility
(blue) as functions of age, from maturity to the age when only 5% of the
adult population is still alive; mortality and fertility are scaled relative to their
means. Subplots are arranged in order of decreasing relative mortality at the
terminal age. Survivorship (on a log scale) from maturity is depicted by the
shaded areas. Broken lines, for trajectories derived from projection matrices,
start at the age when cohorts have converged to within 5% of their
quasi-stationary distribution (see also Supplementary Methods).

For most species in Fig. 1 the age pattern of mortality is derived from 
data on ages rather than stages. For some of these species, mortality 
levels off at advanced ages (for example, for the collared flycatcher, 
Ficedula albicollis, the great tit, Parus major, the fruitfly, Drosophila 
melanogaster) and in others remains constant at all adult ages (for 
example, for Hydra magnipapillata). For hydra in the laboratory, this 
risk is so small that we estimate that 5% of adults would still be alive 
after 1,400 years under those controlled conditions. 
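
As a rough check on the scale of this risk, suppose (as an idealization, not a result reported here) that adult hydra face an exactly constant force of mortality $\mu$ from maturity onwards. Survivorship then decays exponentially, and the 5% threshold at $T = 1{,}400$ years fixes $\mu$:

```latex
l(T) = e^{-\mu T} = 0.05
\quad\Longrightarrow\quad
\mu = \frac{\ln 20}{T} \approx \frac{3.0}{1400\ \text{yr}} \approx 0.0021\ \text{yr}^{-1}
```

Under this assumption, roughly 0.2% of adults would die in any given year.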


Fertility 

The fertility trajectories show considerable variation. For humans the 
trajectories are bell-shaped and concentrated at younger adult ages, but 
other shapes are apparent in Fig. 1. The patterns for killer whales 
(Orcinus orca), chimpanzees, chamois (Rupicapra rupicapra) and spar- 
rowhawks (Accipiter nisus) are also approximately bell-shaped but 
spread over more of the course of life. Other species show trajectories 
of gradually increasing fertility (for example, southern fulmars, and 
the agave, Agave marmorata), asymptotic fertility (for example, tundra 
voles, Microtus oeconomus), or constant fertility (for example, hydra). In 
addition to humans and killer whales, bdelloid rotifers (Macrotrachela 
sp.), nematode worms (Caenorhabditis elegans) and Bali mynah birds 
(Leucopsar rothschildi) have post-reproductive life spans, which lends 
further support to the idea that this phenomenon may be widespread**"’. 


Axes of senescence 


Although the demographic trajectories in Fig. 1 vary widely, most of 
the 46 species can be roughly classified along a continuum of senescence,
running from strong deterioration with age, to negligible deterioration,
to negative senescence’? and improvement with age. However,
there are some deviations, for example, for Soay sheep (Ovis aries) and 
dwarf gorse, which show mortality reductions with adult age followed 
by deterioration. Fertility patterns show similar diversity. 

A fast-slow continuum has been proposed to order species from 
those with short lives and intense early reproduction to those with long 
lives and an extended reproductive period’*""’. Figure 1 displays mor- 
tality and fertility over the adult lifespan; pre-reproductive mortality 
trajectories are also of interest but beyond the scope of this article. If 
distinguished by the length of life, then fast and slow life histories are 
scattered irregularly across Fig. 1. Lifespans range from 1,400 years for 
the hydra to just 25 days for nematode worms. Species with fast life his- 
tories, such as water fleas (Daphnia longispina), are followed in Fig. 1 
by species with slow life histories, such as the lion, and those with slow 
life histories, such as the chimpanzee, occur adjacent to those with fast 
life histories, such as the human louse (Pediculus humanus) and the 
fruitfly (D. melanogaster). Furthermore, species with very different life 
spans can display similar patterns of mortality, fertility and survivor- 
ship. For example, the water flea’s trajectories are similar to the fulmar’s, 
although water fleas reach advanced old age at 48 days, whereas the 
fulmars do so at 33 years. 

If senescence is measured by how long it takes for death rates to rise 
from some level to a higher level, then long-lived species senesce slowly. 
It is more interesting to define senescence by the sharpness or abrupt- 
ness rather than the speed of the increase in mortality. Baudisch® dis- 
tinguishes the pace of life; that is, whether reproduction is fast and life 
spans are short or reproduction is slow and life spans are long, from 
the shape of mortality and fertility trajectories (whether mortality rises 
sharply with age and fertility falls sharply or whether mortality and 
fertility levels are more constant). One measure of pace, the measure 
that we have used, is the terminal age to which only 5% of adults sur- 
vive; this measure is in days or years or some other unit of time. One 
measure of shape, the measure that we have used, is the ratio of mor- 
tality at the terminal age to the average level of adult mortality; this 
time-invariant measure does not change if time is measured in days
versus years. More senescent species, with sharper increases in mortality
with age, have higher values of this measure of shape.
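
The distinction between pace and shape can be made concrete with a short sketch. The function name `pace_and_shape` and the Gompertz-like hazard below are hypothetical choices for illustration; the point is only that changing the time unit rescales the pace but leaves the shape ratio unchanged.

```python
import math

def pace_and_shape(ages, hazard, threshold=0.05):
    """Return (pace, shape) for an adult mortality trajectory.

    ages   -- equally spaced ages since maturity, in any time unit
    hazard -- force of mortality at each age, per the same time unit
    pace   -- terminal age at which survivorship first falls to the threshold
    shape  -- hazard at the terminal age divided by the survivorship-weighted
              mean adult hazard
    """
    step = ages[1] - ages[0]
    surv, alive = [], 1.0
    for mu in hazard:
        surv.append(alive)
        alive *= math.exp(-mu * step)
    end = next((i for i, l in enumerate(surv) if l <= threshold), len(surv) - 1)
    weights = surv[:end + 1]
    mean_hazard = sum(m * w for m, w in zip(hazard, weights)) / sum(weights)
    return ages[end], hazard[end] / mean_hazard

# Hypothetical trajectory, expressed first in years and then in days.
years = list(range(61))
hazard_per_year = [0.01 * math.exp(0.1 * x) for x in years]
days = [365 * x for x in years]
hazard_per_day = [h / 365 for h in hazard_per_year]

pace_y, shape_y = pace_and_shape(years, hazard_per_year)
pace_d, shape_d = pace_and_shape(days, hazard_per_day)
```

Here `pace_d` is 365 times `pace_y`, whereas `shape_d` and `shape_y` agree up to floating-point rounding.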

The measure can be used to explore further the unexpected lack of 
association between the length of life and the degree of senescence. 
Among the first 24 graphs, those with the sharpest senescence, 11 species 
have relatively long life spans and 13 have relatively short life spans. 
Among the final 24 graphs, those with less senescence, 13 species have 
relatively long life spans and 11 have relatively short life spans. This 
weak negative association between the length of life and the degree of 
senescence is reflected in a weak Spearman rank correlation of −0.13,
which is not significantly different from zero (P = 0.362). The Spear- 
man correlations are also non-significant when assessed for animals 
(P = 0.414) and for plants (P = 0.07) examined separately. If the 12 plants 
in Fig. 1 are cross-tabulated as longer or shorter lived, and as more or 
less senescent, then three species fall into each of the four categories. 
Hence the data support Baudisch’s® conjecture that pace and shape 
may be two orthogonal axes of life histories. 
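
For readers who wish to repeat this kind of test on their own data, a Spearman rank correlation between a measure of pace and a measure of shape can be computed as below. This is a generic sketch, not the authors' analysis: the helper names are hypothetical and the six species values are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up values for six hypothetical species (not the study's data).
terminal_age = [0.07, 0.13, 5.0, 33.0, 102.0, 1400.0]  # pace, in years
shape_ratio = [3.1, 0.9, 1.0, 2.5, 20.0, 1.0]          # shape
rho = spearman(terminal_age, shape_ratio)
```

A value of `rho` near zero would indicate, as in the text, that pace carries little information about shape.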

A survivorship curve indicates the proportion of individuals that 
are still alive at a given age. In Fig. 1, we plot survivorship from repro- 
ductive maturity on a logarithmic scale. If mortality increases with age, 
the log-survivorship curve is concave. If mortality is independent of 
age, log-survivorship is linear (for example, roughly from the hydra to 
the red abalone (Haliotis rufescens) in Fig. 1). For species with death rates
that decline with age, the curve is convex (for example, from the red- 
legged frog to the white mangrove at the bottom of Fig. 1). The classi- 
fication of survivorship curves into concave, linear and convex curves 
is known among biologists as type I, II and III, respectively'”"*, but 
normally the curves are plotted for lifespans starting at birth rather 
than at maturity. When the evolutionary theory of ageing** was being 
developed, there was very little empirical evidence for type III survi- 
vorship for adults and little evidence for type II survivorship. The wide- 
spread recognition that traditional theories of ageing predict adult 
senescence to be a universal trait led researchers to strive to find evi- 
dence for senescence in, for example, the mute swan (Cygnus olor)”. 
For this species, fertility does decline and mortality does increase at the 
oldest ages. However, the overall life course is characterized by fertility 
that increases and then slowly declines and by roughly constant mor- 
tality: the log-survivorship curve is nearly straight. It is clear from our 
analyses that the full spectrum of type I, II and III survivorship curves
is found for adults in nature.
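
The link between the direction of mortality change and the curvature of log-survivorship follows because the slope of the log-survivorship curve at any age is minus the force of mortality. A minimal sketch, with hypothetical function names and toy hazards, classifies a trajectory by the sign of the average second difference of log-survivorship:

```python
def log_survivorship(hazard, step=1.0):
    """Log-survivorship from maturity: minus the cumulative hazard at each age."""
    values, cumulative = [], 0.0
    for mu in hazard:
        values.append(-cumulative)
        cumulative += mu * step
    return values

def survivorship_type(hazard, tol=1e-9):
    """Classify by the curvature of log-survivorship (needs at least 3 ages)."""
    log_l = log_survivorship(hazard)
    second = [log_l[i - 1] - 2 * log_l[i] + log_l[i + 1]
              for i in range(1, len(log_l) - 1)]
    curvature = sum(second) / len(second)
    if curvature < -tol:
        return "type I (concave)"    # mortality increases with age
    if curvature > tol:
        return "type III (convex)"   # mortality declines with age
    return "type II (linear)"        # mortality independent of age

# Toy hazards, not data from the study.
rising = survivorship_type([0.1, 0.2, 0.3, 0.4])
flat = survivorship_type([0.2, 0.2, 0.2, 0.2])
falling = survivorship_type([0.4, 0.3, 0.2, 0.1])
```

The tolerance `tol` is an arbitrary choice for this sketch; with noisy empirical data a statistical test on the fitted trajectory would be more appropriate than a fixed cut-off.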


Phylogenetic patterns 


Phylogenetic relatedness seems to have some role in the order of 
species in Fig. 1, as shown by taxonomic clustering of mortality, ferti- 
lity and survivorship patterns. All mammals are clustered in the top 
part of Fig. 1, whereas birds are somewhat more scattered, from the 
Bali mynah in the first row to the great tit in the seventh row. Amphi- 
bians and reptiles are found in the lower half of the panel, with flat 
mortality shapes and almost no overlap with mammals. In contrast, 
invertebrates are scattered across the continuum of senescence, with 
bdelloid rotifers and water fleas sharing the mammalian mortality 
pattern. The plants in our sample tend to occur lower in our ordering, 
with the first being Hypericum cumulicola. Although some angiosperm 
species seem to senesce*””’, many angiosperm species seem not to’, 
perhaps as an artefact of the use of stage-based data’®. The only alga in 
our data set, oarweed (Laminaria digitata), falls in the last row. 

Such clustering within broad taxonomic levels of kingdom (plants, 
animals), or class (mammals, birds), suggests that primitive traits related 
to the bauplan of species may have a pivotal role in determining pat- 
terns of ageing. In fact, the evolutionary conservatism of mechanistic 
determinants of ageing has been highlighted by genetic studies” and it 
has been suggested that asexual reproduction”, modularity”’, lack of 
germ-line sequestration from the soma”””’, the importance of protected 
niches”, regenerative capacity, and the paucity of diverse cell types”, 
may facilitate the escape from senescence in some clades. Many of the 
species in the lower half of Fig. 1 (the reptiles, vascular plants, alga
and coral) continue to grow after reproductive maturity to sizes much
larger than those at maturity. For these indeterminate growers, mor- 
tality is approximately constant or decreases somewhat with age, whereas 
fertility is more or less constant or increases to some extent. Species 
with indeterminate growth may exhibit patterns of senescence that 
are fundamentally different from those of species with determinate
growth.

Approximately constant mortality and fertility are experienced by 
vertebrates such as collared flycatchers and red-legged frogs, inverte- 
brates such as hermit crabs (Pagurus longicarpus) and red abalone, 
and vascular plants such as great rhododendron (Rhododendron maxi- 
mum) and armed saltbush (Atriplex acanthocarpa), with the age at 5% 
survivorship ranging from 5 years for the collared flycatcher to 14 cen- 
turies for hydra. It remains to be seen whether the similarity of patterns 
of mortality, fertility and survivorship among disparate groups of species 
is a coincidence or represents convergent solutions to similar evolu- 
tionary challenges. 


Continuing the exploration of ageing 


Although hundreds of theories have been proposed to explain the 
proximate mechanisms of ageing**”*, theories to explain the ultimate 
evolutionary causes of the varieties of ageing, illustrated by the diverse 
range of trajectories in Fig. 1, are in their infancy. However, scattered 
studies suggest profitable directions for research. It is only recently that 
researchers have extended their analyses beyond the traditional age- 
structured framework**; more complex demographic models show that 
selection gradients in clonal or stage-structured organisms can be non- 
monotonic*’*°. As recognized in the disposable soma theory®, differences
in life-history constraints among species, and the resulting differences
in optimal resource allocation among vital processes, provide a promising
direction for explaining empirical observations about diverse
fertility’**”-~°*"* and mortality**** trajectories. However, current 
theoretical approaches do not yet explain in detail why senescence has 
evolved in some species and not in others. Data sets that are currently 
available for research on ageing are taxonomically biased: high-quality 
data on hundreds of mammalian and bird species exist but data on 
other vertebrate taxa and on invertebrates are sparse. There is very limited 
knowledge of the age patterns of mortality and fertility in species of 
algae, fungi and bacteria’. 

The mortality and fertility trajectories of any species depend on the 
environment in which they are measured. Most human experience is 
bounded by the trajectories of modern Japanese and the hunter-gatherers 
in Fig. 1. Although population ecologists have long studied the res- 
ponses of mortality and fertility to environmental factors, few studies 
have focused on the shape of the age trajectories. Environmental and 
genotypic variation has been documented in laboratory studies of nematode
worms, medflies, Drosophila and other model species*, and in a
field study of Plantago”’. Available evidence suggests that variation can 
be considerable for a species but that the qualitative shapes of mortality 
and fertility trajectories are similar, as illustrated by humans in Fig. 1 
(see the Supplementary Note and Extended Data Fig. 1, which high- 
lights intraspecific variation in the mortality trajectories of laboratory 
rats (Rattus norvegicus) and mice (Mus musculus)). In addition to the 
lack of data for most species, and for variation within a species, little 
information is available on mortality at advanced ages beyond the age 
cut-off in Fig. 1. In the species for which such data are available, mor- 
tality approaches a plateau at the oldest ages (for example, for humans, 
fruitflies (D. melanogaster) and nematode worms) or declines (for Medi- 
terranean fruitflies, Ceratitis capitata)**’. The deceleration of mor- 
tality at high ages is more apparent if death rates are plotted on a log 
scale rather than the linear scale used in Fig. 1 (ref. 45). 

Deeper understanding of the evolutionary demography of ageing 
depends on the compilation of demographic data on diverse species 
investigated in the wild as well as in laboratories and zoos’, and on the 
development of more inclusive theories that can account for negligible 
and negative senescence** as well as for the steepness of deterioration
with age in senescent species. In such empirical and theoretical studies,
researchers should guard against anthropocentric intuition about age- 
ing: humans, especially modern humans, are extreme outliers in Fig. 1. 


METHODS SUMMARY 


Selection of examples. We aimed to examine demographic trajectories for organ- 
isms across the tree of life. We therefore chose representative data sets compiled 
from the published literature for the major groups of organisms including verte- 
brate and invertebrate animals, plants and algae. Within the vertebrates we included 
exemplars of every major clade, including primates and other mammals, birds, 
reptiles, amphibians and fish. Representatives for the invertebrates included insects, 
molluscs, cnidarians and a crustacean. In the plant group we included both gym- 
nosperms and angiosperms and, finally, we included a green alga. We favoured 
data sets that covered longer time periods, with larger sample sizes and, when 
possible, we preferred data sets that included information on realized reproduc- 
tion and recruitment to those that simply recorded reproductive output. In addi- 
tion, for dioecious species, we favoured data sets for females. See Supplementary 
Methods 1 and 2 for details. 

Calculation of standardized trajectories. We classified the studies as: first, cohort
studies; second, period studies with number at risk and numbers dying within a 
period; third, period studies depicting an age structure at a single point in time; 
or fourth, stage-structured population projection matrices (see Supplementary 
Methods 2 for details). We considered mortality and fertility trajectories from the 
age at maturity to the age at which 5% survivorship from maturity occurs. The 
trajectories of all data types, except the projection matrix data, were smoothed 
using P-splines””. We then calculated the force of mortality ($\mu_x$) and fertility rate
($m_x$) before standardizing them by dividing them by the respective averages,
weighted by survivorship from maturity ($l_x$).
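
One way to write this standardization explicitly is given below, with $\mu_x$ the force of mortality, $m_x$ the fertility rate and $l_x$ survivorship from maturity at age $x$. The discrete sums and the symbols $\alpha$ (age at maturity, where $l_\alpha = 1$) and $\omega$ (terminal age, where $l_\omega = 0.05$) are notation assumed here for illustration; the exact computation is described in the Supplementary Methods.

```latex
\bar{\mu} = \frac{\sum_{x=\alpha}^{\omega} l_x\,\mu_x}{\sum_{x=\alpha}^{\omega} l_x},
\qquad
\bar{m} = \frac{\sum_{x=\alpha}^{\omega} l_x\,m_x}{\sum_{x=\alpha}^{\omega} l_x},
\qquad
\mu_x^{\mathrm{rel}} = \frac{\mu_x}{\bar{\mu}},
\qquad
m_x^{\mathrm{rel}} = \frac{m_x}{\bar{m}}
```

By construction, the survivorship-weighted means of $\mu_x^{\mathrm{rel}}$ and $m_x^{\mathrm{rel}}$ over adult ages both equal 1.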


Online Content Any additional Methods, Extended Data display items and 
Source Data are available in the online version of the paper; references unique 
to these sections appear only in the online paper.

Extended Data Figure 1 | Standardized mortality trajectories.
a, Trajectories for laboratory rats. b, Trajectories for laboratory mice. Each line
represents a different strain, sex or population (see Supplementary Methods for
sources). We standardized the age axis to consider the trajectories from age
at maturity to the age at which 5% survivorship from maturity occurs. The
trajectories were smoothed using P-splines. We then calculated the force of
mortality ($\mu_x$) and standardized it by dividing by the average value, weighted
by survivorship from maturity ($l_x$). Note that the sample sizes in most cases
were small (approximately 50 to 60 individuals) and thus random fluctuations
may lead to erratic curves in some cases.

Elephant shark genome provides unique
insights into gnathostome evolution

Martin F. Flajnik, Yoichi Sutoh, Masanori Kasahara, Shawn Hoon, Vamshidhar Gangu, Scott W. Roy, Manuel Irimia,
Vladimir Korzh, Igor Kondrychyn, Zhi Wei Lim, Boon-Hui Tay, Sumanty Tohari, Kiat Whye Kong, Shufen Ho,
LaDeana W. Hillier, Patrick Minx, Thomas Boehm, Richard K. Wilson, Sydney Brenner & Wesley C. Warren


The emergence of jawed vertebrates (gnathostomes) from jawless vertebrates was accompanied by major morphological 
and physiological innovations, such as hinged jaws, paired fins and immunoglobulin-based adaptive immunity. 
Gnathostomes subsequently diverged into two groups, the cartilaginous fishes and the bony vertebrates. Here we 
report the whole-genome analysis of a cartilaginous fish, the elephant shark (Callorhinchus milii). We find that the 
C. milii genome is the slowest evolving of all known vertebrates, including the ‘living fossil’ coelacanth, and features 
extensive synteny conservation with tetrapod genomes, making it a good model for comparative analyses of gnathostome 
genomes. Our functional studies suggest that the lack of genes encoding secreted calcium-binding phosphoproteins in 
cartilaginous fishes explains the absence of bone in their endoskeleton. Furthermore, the adaptive immune system of 
cartilaginous fishes is unusual: it lacks the canonical CD4 co-receptor and most transcription factors, cytokines and 
cytokine receptors related to the CD4 lineage, despite the presence of polymorphic major histocompatibility complex 
class II molecules. It thus presents a new model for understanding the origin of adaptive immunity. 


The emergence of gnathostomes from jawless vertebrates marks a 
major event in the evolution of vertebrates. This transition was accom- 
panied by many morphological and phenotypic innovations, such as 
jaws, paired appendages and an adaptive immune system based on 
immunoglobulins, T-cell receptors and major histocompatibility com- 
plex (MHC) molecules’ (Fig. 1a). How these novelties emerged and how 
they facilitated the divergence, adaptation and dominance of gnathos- 
tomes as the major group (99.9%) of living vertebrates are key unresolved 
questions. The living gnathostomes are divided into two groups, the 
cartilaginous fishes (Chondrichthyes) and bony vertebrates (Osteichthyes), 
which diverged about 450 Myr ago (Fig. 1a). A key feature distinguish- 
ing the two groups is that chondrichthyans have largely cartilaginous 
endoskeletons whereas osteichthyans have ossified endoskeletons. Although 
fossil jawless vertebrates (for example galeaspids) and jawed vertebrates 
(for example placoderms) possessed dermal and perichondral bone, 
endochondral bone is found only in osteichthyans*. Chondrichthyans 
include about 1,000 living species that are grouped into two lineages, 
the holocephalans (chimaeras) and elasmobranchs (sharks, rays and 
skates), which diverged about 420 Myr ago® (Fig. 1a). A detailed whole- 
genome evaluation of a chondrichthyan and comparative analysis with 
the available genome information on osteichthyans and a jawless verte- 
brate’ might help us to understand features unique to chondrichthyans 
and provide insights into the ancestral state of gnathostome-specific 
morphological features and physiological systems. 

We previously identified C. milii, a holocephalan, as a chondrichthyan 
genome model*” because of its relatively small genome (~1.0 gigabase).


Compared with elasmobranchs, the unique features of holocephalans 
include a single gill opening, a complete hyoid arch, fusion of the upper 
jaw to the cranium, and non-replaceable hypermineralized tooth plates’” 
(Fig. 1b). Callorhinchus milii inhabits temperate waters of the contin- 
ental shelves off southern Australia and New Zealand, typically at depths 
of 200 to 500 m (ref. 11). Here, we report the generation and analysis 
of a high-quality genome sequence of C. milii. Several key findings are 
presented here and further details on our in-depth characterization of 
this genome are presented in Supplementary Notes I to XI. 


Genome assembly and annotation 


Genomic DNA of a single male C. milii was sequenced and assembled
(Supplementary Note I) to a size of 0.937 gigabases, comprising 21,208
scaffolds (N50 contig, 46.6 kilobases; N50 scaffold, 4.5 megabases;
Supplementary Table I.1). The average GC content of the C. milii genome
is 42.3%, and approximately 46% of the genome is organized into iso- 
chores (Supplementary Note II). Using the Ensembl annotation pipe- 
line’* and RNA-seq transcript evidence, we predicted a total of 18,872 
protein-coding genes. In addition, microRNA (miRNA) genes were 
identified by small-RNA sequencing and annotation of the genome 
assembly (Supplementary Note III). Callorhinchus milii has more
miRNA gene loci (693 genes and 136 families) than do teleosts (for 
example, zebrafish have 344 genes and 94 families) but fewer than do 
humans (1,527 genes and 558 families) and other mammals (mirBase 
release 19). Several novel C. milii-specific miRNAs are expressed at 
high levels in a tissue-specific manner (Supplementary Figs III.1 and
III.2). Notably, a considerable number (16%; 22/136) of C. milii miRNA
families conserved in mammals have been secondarily lost in teleosts.
Their explicit tissue-specific expression patterns in C. milii and mammals
suggest that they have important roles in gene regulation. A total of
63,877 noncoding elements (average size, 271 base pairs) conserved
between C. milii and bony vertebrates represent potential cis-regulatory
elements (Supplementary Note IV). Surprisingly, only a tiny fraction
(less than 1.0%) of these are found in the genomes of sea lamprey, sea
squirt and amphioxus, suggesting that the emergence of gnathostomes
was accompanied by major innovations in cis-regulatory elements and
gene regulatory networks.

Figure 1 | Phylogeny of chordates. a, Major shared features of various
vertebrate taxa; b, unique features of C. milii. Lampreys and hagfishes
(cyclostomes) lack mineralized tissues. In contrast, cartilaginous fishes produce
extensive dermal bone such as teeth, dermal denticle and fin spine. However,
they lack the ability to make endochondral bone which is unique to bony
vertebrates. Divergence times are from refs 6, 36. Ceno, Cenozoic era;
PC, Precambrian era.
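
The N50 statistics quoted for the assembly in this section follow a standard definition: N50 is the length L such that contigs or scaffolds of length L or greater together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are illustrative and are not taken from the C. milii assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Made-up scaffold lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
```

For the toy list the total is 300 bases, and the two longest scaffolds (80 + 70) already cover half of it, so `toy_n50` is 70.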


Phylogenomics and evolutionary rate 


Morphological and palaeontological studies have placed Chondrichthyes 
as a sister group to bony vertebrates’. However, subsequent molecular 
phylogenetic analyses based on mitochondrial or nuclear genes have 
produced conflicting topologies'?'’. Using a genome-scale data set 
comprising 699 one-to-one orthologues from C. milii and 12 other 
chordates, we provide robust support for the traditional phylogenetic 
tree with an unambiguous split between Chondrichthyes and bony 
vertebrates (Supplementary Fig. V.1). Furthermore, analysis of gains 
and losses of introns provided independent support for Chondrichthyes 
as a sister group to bony vertebrates (Supplementary Note V). 

Previous studies based on a few mitochondrial and nuclear protein-coding
genes indicated that the nucleotide substitution rate in elasmobranchs
is an order of magnitude lower than that in mammals'*’’. Using the
genome-wide set of 699 orthologues, we estimated the molecular evolutionary
rate of C. milii and compared it with that of other gnathostomes, with
sea lamprey as the outgroup. Callorhinchus milii protein-coding genes
have evolved significantly more slowly than those of all other vertebrates examined
(P < 0.01 for all comparisons; Supplementary Tables VI.1–VI.3), including
the coelacanth, which has been considered to be the slowest evolving
bony vertebrate’*. A neutral tree based on fourfold-degenerate
sites indicated that the low evolutionary rate is a reflection of the 
neutral nucleotide mutation rate, and confirmed that the neutral evolu- 
tionary rate of C. milii is the lowest (Fig. 2a). 

The lower rates of molecular evolution of C. milii are also evident in 
the fewer changes in the intron-exon organization of its genes (Sup- 
plementary Note VII). Callorhinchus milii has experienced fewer 
intron gains or losses than any bony vertebrate since their divergence 
from the gnathostome ancestor (Fig. 2a). The highest rates of change 
were found in two teleost fishes, the stickleback and zebrafish, with the 
stickleback lineage experiencing the highest number of changes (603 
gains and 126 losses after it split from the zebrafish lineage) recorded 
in any vertebrate lineage (Fig. 2a). In addition to a lower rate of intron 
changes, the C. milii genome also has experienced a relatively low rate 
of major interchromosomal rearrangements, comparable to that of 
chicken, which has one of the most stable karyotypes among tetrapods 
(Supplementary Note VIII). An extensive conservation of 
synteny was observed in comparisons of C. milii scaffolds with chicken 
and human chromosomes, with a majority of C. milii scaffolds (93%) 
showing a one-to-one correspondence with chicken chromosomes 
(Supplementary Figs VIII.1 and VIII.2). A three-way comparison between 
C. milii, chicken and teleosts (medaka and zebrafish) revealed that 
teleosts have undergone a substantially higher number of interchromosomal 
rearrangements than previously demonstrated by simple tetrapod-teleost 
comparisons (Fig. 2b and Supplementary Tables VIII.7 and VIII.8). 


Evolution of protein-coding gene families 

Protein domains 

Comparisons of Pfam domains (Supplementary Note IXa) identified 17 
C. milii domains that are missing in bony vertebrates (Supplementary 
Table IX.2). Sixteen of these are present in amphioxus or other eukar- 
yotes, and are thus ancient protein domains that are still retained in 
C. milii but have been lost from bony vertebrates. We note that 13 
domains shared between C. milii and tetrapods are absent in teleosts 
(Supplementary Table IX.4), indicating that they have been secondarily 
lost from the teleost lineage. 


Lineage-specific gene losses 

Orthologues of more C. milii genes were found to be lost from the 
teleost lineage (271 genes; Supplementary Note IXb and Supplementary 
Table IX.6) relative to the tetrapod lineage (34 genes; Supplementary 
Table IX.7). Human orthologues of many genes lost from teleosts 
are associated with genetic diseases (104 genes, 38%; Supplementary 
Table IX.6), indicating their importance for human physiology. The 
loss of these genes from teleosts supports the idea that teleosts repres- 
ent a more derived group than do other gnathostomes. The functional 
annotation of zebrafish orthologues of the 34 genes lost from tetrapods 
highlighted several genes that are specific to the aquatic lifestyle, such 
as those regulating fin and lateral-line development and those encod- 
ing receptors for water-soluble odorants (Supplementary Table IX.7). 
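
The logic of calling a lineage-specific loss can be sketched from an orthologue presence/absence table. In the sketch below a gene counts as lost from a lineage if an orthologue exists in the reference species (here C. milii) and in at least one species of the comparison lineage, but in none of the species of the lineage under test. The function name, the species sets and the gene names are hypothetical, and the exact criteria applied in Supplementary Note IXb may differ.

```python
def lineage_specific_losses(presence, reference, lost_in, retained_in):
    """Return genes present in `reference`, absent from every species in
    `lost_in`, and present in at least one species in `retained_in`.

    presence: dict mapping gene name -> set of species with an orthologue.
    """
    losses = []
    for gene, species in presence.items():
        if (reference in species
                and not (species & lost_in)
                and (species & retained_in)):
            losses.append(gene)
    return sorted(losses)


# Hypothetical presence/absence data, for illustration only.
teleosts = {"zebrafish", "medaka", "stickleback"}
tetrapods = {"human", "chicken", "frog"}
presence = {
    "geneA": {"C. milii", "human", "chicken"},            # absent from teleosts
    "geneB": {"C. milii", "zebrafish", "medaka"},         # absent from tetrapods
    "geneC": {"C. milii", "human", "zebrafish"},          # retained in both
    "geneD": {"human", "chicken"},                        # not in the reference
}

lost_from_teleosts = lineage_specific_losses(presence, "C. milii", teleosts, tetrapods)
lost_from_tetrapods = lineage_specific_losses(presence, "C. milii", tetrapods, teleosts)
```

Requiring presence in the comparison lineage guards against counting genes that are simply specific to the reference species.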


Genetic basis of bone formation 


Bone is the most widespread mineralized tissue in vertebrates, and its 
formation represents a major leap in vertebrate evolution. Although 
chondrichthyans produce dermal bone (for example teeth, dermal 
denticles and fin spines) and calcified cartilage, unlike bony vertebrates, 
their cartilage is not replaced with endochondral bone. Among 
vertebrates, the earliest mineralized tissue was found in the feeding 
apparatus of extinct jawless fishes, the conodonts. Early dermal bone 
was found in extinct jawless vertebrates such as heterostracans, 
whereas perichondral bone surrounding the cartilage was found in 
several fossil jawless vertebrates (osteostracans, galeaspids) and jawed 
vertebrates (placoderms, acanthodians; Fig. 3). However, the highly 
complex process of endochondral ossification is unique to bony vertebrates. 
The C. milii genome sequence provided a unique opportunity to 
address the question of why the endoskeleton of chondrichthyans is 
not ossified. 

Figure 2 | Callorhinchus milii possesses the slowest evolving vertebrate 
genome. a, Neutral tree of 13 chordates based on fourfold degenerate sites. 
Pairwise distances to amphioxus are shown for each species above their 
respective branches. Intron gain/loss (+/-) events are shown in red below 
taxon labels and at ancestral nodes. b, Circos plots showing syntenic 
relationships between C. milii scaffolds and chicken and zebrafish 
chromosomes. Most C. milii scaffolds show a one-to-one relation with chicken 
chromosomes whereas they show a correspondence of one-to-two, or more, 
with zebrafish chromosomes. Chicken microchromosomes are labelled in red. 

Figure 3 | Genetic events underlying the emergence of bone formation in 
vertebrates. Duplication of Sparc by whole-genome duplication initially gave 
rise to Sparcl1, and the subsequent tandem duplication of Sparcl1 gave rise to 
the SCPP gene family responsible for endochondral ossification. Because the 
sea lamprey genome contains only Sparc but no Sparcl1 (ref. 22), we have placed 
the genome duplication event that gave rise to Sparcl1 after the divergence of 
jawless vertebrates from the jawed vertebrate ancestor. The sister relationship 
of chondrichthyans and acanthodians is based on ref. 37. SCPP, secretory 
calcium-binding phosphoprotein gene family member. †, extinct. 


We searched the C. milii genome assembly and transcriptomes 
for genes known to be involved in bone formation in osteichthyans 
(Supplementary Note X). All gene family members involved in bone 
formation were present, except the secretory calcium-binding phos- 
phoprotein (SCPP) gene family (Supplementary Table X.1). This gene 
family encodes a diverse array of secreted phosphoproteins that arose 
from the gene Sparc-like 1 (Sparcl1) through tandem duplication, and 
Sparcl1 itself arose from an ancient metazoan gene, Sparc, through 
whole-genome duplication. There are two main categories of SCPP 
genes: one group encodes acidic proteins and the other encodes proline- 
and glutamine-rich (P/Q-rich) proteins. In the human genome, the two 
groups are found in two different clusters on chromosome 4; the acidic 
SCPP genes (SPP1, MEPE, IBSP, DMP1 and DSPP, collectively known 
as SIBLING genes) occur between PKD2 and SPARCL1, whereas the 
P/Q-rich SCPP genes are found in the enamel matrix protein-SCPP 
cluster ~17 megabases downstream of SPARCL1 (Supplementary Fig. X.2). 
Acidic SCPP or SIBLING genes are involved in the ossification of col- 
lagenous matrix in bone and dentine, and P/Q-rich SCPP genes are 
involved in the production of enamel, milk, tears and saliva. Although 
there are variable numbers of P/Q-rich SCPP genes in teleosts, there is a 
single SIBLING gene, spp1, in zebrafish and medaka. Zebrafish spp1 
(also known as osteopontin) is expressed specifically in osteoblasts 
and has therefore been proposed to have a primary function in bone 
formation similar to its mammalian orthologue. 

The C. milii genome contains both Sparc and Sparcl1 on different 
scaffolds that show extensive conserved synteny with orthologous loci 
in human and other bony vertebrates (Supplementary Figs X.1 and 
X.2). However, there is no SCPP gene cluster in the intergenic region 
between Pkd2 and Sparcl1 or elsewhere in the genome (Supplementary 
Fig. X.2). The genomic and transcriptomic resources available for other 
cartilaginous fishes such as the little skate (Leucoraja erinacea) and the 
small-spotted catshark (Scyliorhinus canicula) as well as the genome 
assembly of the sea lamprey, a jawless vertebrate, also do not contain 
any SCPP genes (Supplementary Note X). These findings suggest that 
the tandem duplication of Sparcl1 that gave rise to SCPP genes 
occurred in the common ancestor of osteichthyans after this lineage 
split from the chondrichthyan lineage (Fig. 3). Because SCPP genes 
have a crucial role in the formation of bone, we propose that their 
absence from C. milii explains the absence of bone from the endoskel- 
eton of cartilaginous fishes. 

To test this hypothesis, we used two different methods to disrupt the 
function of the single bone-specific SIBLING gene spp1 in zebrafish. 
The knockdown of spp1 using two gene-specific morpholinos (ATG 
MO and E2-12 MO) resulted in a significant reduction in endochondral 
and dermal bone formation by comparison with embryos injected with 
5-base-pair-mismatch control morpholinos (Supplementary Fig. X.4). 
Unlike the transient effects exerted by morpholinos, the genetic 
interference afforded by the CRISPR/Cas9 system results in heritable 
genomic modifications; indeed, targeting exons 6 and 7 of the spp1 gene using 
the CRISPR/Cas9 system resulted in the generation of specific 
insertion/deletion mutations at the target sites, including a ~2.6-kilobase 
deletion when exons 6 and 7 were simultaneously targeted (Supplementary 
Fig. X.6). Embryos 5 days post-fertilization (dpf) with deletions in 
exon 7 alone or in both exon 6 and exon 7 of spp1 showed a significant 
reduction in the formation of endochondral and dermal bone (Fig. 4), 
with the defect in bone formation persisting in 15-dpf mutant embryos 
(Supplementary Fig. X.9). The similar phenotypes obtained using two 
different methods of manipulation indicate that the effects on bone 
formation are specific, and strongly suggest that spp1 has an essential 
role in the modulation of bone formation in zebrafish. 
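
For readers unfamiliar with how CRISPR/Cas9 target sites are chosen, the sketch below scans a DNA sequence for candidate sites of the form commonly used with Streptococcus pyogenes Cas9: a 20-nucleotide protospacer followed immediately by an NGG protospacer-adjacent motif (PAM), on either strand. That this enzyme and PAM apply here is an assumption, and the function names and the example sequence are hypothetical (it is not the spp1 sequence); the sgRNAs actually used are described in the Supplementary Information.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_targets(seq, protospacer_len=20):
    """List (strand, start, protospacer) for every protospacer + NGG site.

    `start` is a 0-based offset in the scanned strand; for the '-' strand it
    refers to the reverse-complemented sequence.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % protospacer_len)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(s):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical 23-nt sequence: 20 A's followed by the PAM "TGG".
example_hits = find_cas9_targets("A" * 20 + "TGG")
```

A real design workflow would also screen candidates for off-target matches elsewhere in the genome, which this sketch does not attempt.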

The results of the zebrafish spp1 knockdown experiments provide 
strong support for our hypothesis that the absence of SCPP genes 
from cartilaginous fishes is related to the unossified nature of their 
endoskeleton. In turn, the absence of SCPP genes from chondrichth- 
yans raises questions about the genetic basis of dermal and perichon- 
dral bone formation in chondrichthyans, placoderms, acanthodians 
and extinct jawless vertebrates. We speculate that one or more SCPP- 
related genes, probably Sparc, Sparcl1 or both, mediate the minerali- 
zation of skeleton in these vertebrates. 


Primordial adaptive immune system 
The chondrichthyan immune system shares many features of the 
innate and adaptive immune systems of mammals. However, 
several important differences, confirmed by transcriptome analysis of 
an elasmobranch cartilaginous fish, the nurse shark (Ginglymostoma 
cirratum), which diverged from the C. milii lineage ~420 Myr ago, 
highlight several unexpected features of the primordial state of gnathostome 
immune systems, especially for adaptive immunity (Supplementary Note 
XI, Supplementary Figs XI.1-XI.10 and Supplementary Tables XI.1-XI.13). 
First, the genome assembly suggests a close linkage of immunoglobulin 
and T-cell receptor genes, compatible with the notion that the 
somatically diversifying antigen receptor genes evolved from a common 
ancestor via en bloc duplications. Indeed, the presence of the variant 
antigen receptor NAR-TCR, but lack of other specialized forms of antigen 
receptors, such as IgW and IgNAR in the C. milii genome suggest that 
the single-domain V region, subjected to somatic diversification by Rag 
proteins, was at first part of a T-cell receptor (TCR)-like structure before 
being co-opted by immunoglobulins. Second, the linkage of antigen 
receptor genes with certain MHC genes, whose products functionally 
interact in regulating the immune response, supports the co-evolutionary 
origin of antigen presentation and recognition. Third, the immunogenome 
of C. milii is compatible with the presence of cytotoxic natural-killer 
and CD8+ T cells; by contrast, the absence of the canonical CD4 
co-receptor, the transcription factor RORC, several cytokines and cytokine 
receptors that are associated with helper and regulatory functions of 
CD4+ T cells in mammals (Fig. 5) suggests the presence of a primordial 
type of helper function in cartilaginous fishes. Restricted helper functions 
(exemplified by the lack of the T-follicular helper lineage) in cartilaginous 
fishes may explain the long lag time required to generate humoral immunity 
(affinity maturation and memory) in sharks. Thus, the emerging 
adaptive immune system seems to have been characterized by the 
presence of a full-blown cytotoxic system and a primordial helper 
system that was geared towards a TH1-type response. 

Figure 4 | Targeted mutagenesis of zebrafish spp1 by sgRNA:Cas9 results in 
reduced bone formation. a, spp1 is specifically expressed in cells surrounding 
the bone matrix. Ventral view of a 5-dpf embryo hybridized with a spp1-specific 
RNA probe. Yellow labels, endochondral bones (cb5, ceratobranchial 5; ch, 
ceratohyal); white labels, dermal bones (bsr, branchiostegal ray; cl, cleithrum; 
d, dentary; en, entopterygoid; op, operculum; ps, parasphenoid). b, Ventral 
view of a 5-dpf wild-type (WT) embryo stained with alizarin red to reveal sites 
of bone deposition (red fluorescence). mx, maxilla. c, Bright-field image merged 
with b to visualize anatomical structures and locations of bone deposition 
simultaneously. d, Ventral views of 5-dpf embryos injected with Cas9 mRNA 
together with single guide RNA (sgRNA) targeting spp1 exon 6, exon 7 or both 
(alizarin red staining). The embryos were scored as normal (resembling wild 
type), mild or strong bone phenotypes, with the latter showing the greatest 
reduction in bone formation. The variations in the extent of bone reduction are 
presumably due to somatic chimaerism with regard to spp1 disruption. 
e, Proportions of mild and strong bone phenotypes induced by disruption of 
spp1 by sgRNA/Cas9. Targeting of exon 7 (n = 206 embryos) or both exons 6 
and 7 (n = 143) resulted in significantly higher proportions of strong bone 
phenotype (P < 0.01, Fisher's exact test) compared with control injections of 
Cas9 mRNA (n = 190) and exon 7 sgRNA (n = 143) (Ex6, Cas9: n = 72). 
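
The comparison of phenotype proportions in Fig. 4e relies on Fisher's exact test. A minimal sketch of a two-sided version for a 2 x 2 table is given below, using only the Python standard library. The group sizes match two of the groups in the legend, but the numbers of embryos with a strong phenotype are invented for illustration, because the per-category counts are not given in the text; the function name is likewise hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties are not lost to rounding.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 60 of 206 exon 7-targeted embryos and 5 of 190
# Cas9-only control embryos scored as having a strong bone phenotype.
p_value = fisher_exact_two_sided(60, 206 - 60, 5, 190 - 5)
```

An exact test is a natural choice when some cells of the table are small, as would be the case if strong phenotypes were rare in the control group.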
Despite the apparent lack of Treg cells in cartilaginous fish, the 
elimination of self-reactive lymphocytes during the process of central 
tolerance seems to be a primordial vertebrate feature, as demonstrated by 
the presence of AIRE, which is responsible for the expression of 
peripheral self-antigens for presentation via MHC to T cells in the medulla 
of the thymus. The absence of the gene encoding a bona fide CD4 
co-receptor despite the presence of polymorphic MHC class II genes 
suggests the presence of unusual types of CD8 helper T cells, capable of 
exerting TH1-like activities through IFN-γ, IL-12 and TNF-α (Fig. 5). 
We suggest that such cells are capable of recognizing exogenous antigens 
presented by MHC class II molecules but might also interact with 
other antigens in an MHC-independent, antibody-like recognition 
mode, as recently demonstrated in mice deficient in both CD8 and 
CD4 co-receptors. It is to be noted that the secondary loss of CD4, 
MHC class II and invariant chain genes in a teleost, the Atlantic cod 
(Gadus morhua), is accompanied by compensatory changes, such as 
amplification of MHC class I and Toll-like receptor genes, most probably 
related to the disappearance of T-helper lineages only in this species. 
In stark contrast, the lack of such compensatory features in 
cartilaginous fishes, demonstrated in two highly divergent species, 
C. milii and G. cirratum, supports the hypothesis that the features 
described here are primordial. Our data also suggest that CD8 and 
CD4 were co-opted as co-receptors for the TCR at different times in 
evolution, and that T cells recognizing MHC class II molecules in 
cartilaginous fish recruit the downstream signalling components to the 
immunological synapse entirely through the TCR complex. 

Figure 5 | T-cell lineages in mammals and cartilaginous fishes. a, Schematic 
diagram of mammalian CD8+ cytotoxic cells producing various effector 
molecules, such as PRF1 (perforin), GZM (granzymes), IFN-γ (interferon-γ) 
and TNF-α (tumour necrosis factor-α). Interleukins 7 (IL-7) and 15 (IL-15) are 
indispensable for induction of the transcription factor RUNX3, which controls 
expression of the genes encoding the CD8 co-receptor that interacts with the 
TCR and MHC class I molecules. All key elements of this T-cell lineage are 
present in the genomes of cartilaginous fishes. b, Diversification of CD4 
lineages in mammals. The transcription factor ThPOK controls expression of 
the co-receptor CD4, which interacts with the TCR and MHC class II molecules 
and represses the CD8 lineage programme. The listed inducer cytokines 
activate key transcriptional regulators that define the multiple CD4+ lineages 
specialized for producing various effector molecules; genes not found in 
cartilaginous fishes are struck through. Cartilaginous fishes probably possess 
only TH1 cells (red rectangle) and lack other helper subsets. TGF-β, 
transforming growth factor-β. 


Conclusion 

Among the three living lineages of vertebrates (cyclostomes, cartilaginous 
fishes and bony vertebrates), bony vertebrates are the largest and 
most diverse group of vertebrates. Because cartilaginous fishes are the 
sister group of bony vertebrates, they constitute a critical outgroup for 
understanding the evolution and diversity of bony vertebrates. The 
whole-genome analysis of C. milii, a holocephalan cartilaginous fish, 
shows that the C. milii genome is evolving significantly slower than other 
vertebrates, including the coelacanth, which is considered a ‘living fossil’. 
Although several physiological and environmental factors have been 
proposed to explain the interspecific variation in molecular evolutionary 
rates, the factors contributing to the lower evolutionary rate of 
C. milii are not known. Overall, the C. milii genome is the least derived 
among known vertebrates and is therefore a good model for inferring the 
state of the ancestral chondrichthyan and gnathostome genomes. Its 
value for comparative genomic studies is illustrated by our analysis of 
genetic events that led to the ossification of endoskeleton in bony 
vertebrates. Unexpected was our finding that the adaptive immune system of 
cartilaginous fishes possesses highly restricted subsets of T helper cells 
(perhaps only one) with unconventional antigen-binding properties; 
this suggests that helper and regulatory functions of T cells that 
recognize MHC class II molecules became more elaborate in the ancestor of 
bony vertebrates through the acquisition of transcription factors such as 
RORC, the CD4 co-receptor, conventional FOXP3 and a host of 
CD4-lineage-specific cytokines and cytokine receptors. Thus, the whole-genome 
analysis of C. milii provides fresh insight into the mechanism of bone 
formation and the origin of adaptive immunity of gnathostomes. 


METHODS SUMMARY 


Genomic DNA was obtained from the testis of a single C. milii caught in Tasmania, 
Australia, and used to prepare libraries with inserts of different sizes. Sequencing was 
conducted on the Roche 454 GS FLX Titanium platform (for shotgun fragments and 
3-kilobase and 8-kilobase inserts) and the ABI3730 instrument (for plasmid, fosmid 
and BAC ends). The C. milii genome was assembled using CABOG v6.1. PolyA- 
selected RNA from ten tissues of C. milii (brain, gills, heart, intestine, kidney, liver, 
muscle, ovary, spleen and testis) and the spleen and thymus of G. cirratum were 
sequenced on the Illumina GAIIx platform. Transcripts were assembled de novo 
using Trinity r2011-07-13. MicroRNA genes were identified by deep sequencing of 
small RNA from 16 tissues (brain, blood, eye, gills, heart, intestine, kidney, liver, 
muscle, ovary, pancreas, rectal gland, skin, spleen, testis and uterus) on the Illumina 
GAIIx platform. Annotation of the genome was carried out using the Ensembl gene 
annotation pipeline which integrated ab initio gene predictions and evidence-based 
gene models. The gene set can be viewed at http://esharkgenome.imcb.a-star.edu.sg/. 
Annotation of protein domains in the C. milii proteome was carried out by searching 
it against the Pfam v26 database using HMMER v3.0. Zebrafish spp1 was knocked 
down using morpholinos or the CRISPR/Cas9 system. Methods are described in 
detail in individual sections of the Supplementary Information. 

All animals were cared for in strict accordance with National Institutes of Health 
(USA) guidelines. The protocol was approved by the Institutional Animal Care and 
Use Committee of the Biological Resource Centre, A*STAR (protocol #100520). 


Patterning and growth control by 
membrane-tethered Wingless 


Cyrille Alexandre1*, Alberto Baena-Lopez1* & Jean-Paul Vincent1 


Wnts are evolutionarily conserved secreted signalling proteins that, in various developmental contexts, spread from 
their site of synthesis to form a gradient and activate target-gene expression at a distance. However, the requirement for 
Wnts to spread has never been directly tested. Here we used genome engineering to replace the endogenous wingless 
gene, which encodes the main Drosophila Wnt, with one that expresses a membrane-tethered form of the protein. 
Surprisingly, the resulting flies were viable and produced normally patterned appendages of nearly the right size, albeit 
with a delay. We show that, in the prospective wing, prolonged wingless transcription followed by memory of earlier 
signalling allows persistent expression of relevant target genes. We suggest therefore that the spread of Wingless is 
dispensable for patterning and growth even though it probably contributes to increasing cell proliferation. 


Wnts are secreted signalling proteins that have been suggested to act 
at a distance to control patterning and growth during development. 
Long-range Wnt activity has been most extensively studied in wing 
imaginal discs of Drosophila. These epithelial pockets, set aside in 
the embryo, grow from approximately 50 to 50,000 cells during larval 
stages to give rise to fully patterned wings during pupariation. During 
early larval life, Wingless (Wg), the main Drosophila Wnt, is initially 
expressed throughout the prospective wing field to establish wing 
primordium. At subsequent stages, Wg is produced in a narrow stripe of 
cells at the dorsoventral boundary. It is thought that from there it spreads 
throughout the prospective wing blade and activates target-gene 
expression in a concentration-dependent manner. A common view is that, 
near the dorsoventral boundary, high signalling activity activates senseless 
expression, which contributes to wing margin fates, whereas further 
away, in the prospective wing blade (up to 50 cell diameters), low-level 
signalling stimulates the expression of more sensitive target genes like 
vestigial, Distal-less and frizzled 3 (refs 4, 5, 18, 19) and promotes growth. 
Although the importance of graded Wg signalling has been disputed, 
it is generally accepted that Wg needs to spread over the whole wing field 
for patterning and growth. To test this assumption directly, we sought 
to modify the wingless (wg) locus so that membrane-tethered Wg would 
be expressed at physiological level in the absence of the wild-type form. 
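
To give a sense of the growth involved, the increase from approximately 50 to 50,000 cells quoted above can be converted into a number of population doublings. The short calculation below assumes, purely for illustration, that every cell divides and none is lost; the variable names are not from the original text.

```python
from math import log2

start_cells = 50       # approximate cell number of the early disc
end_cells = 50_000     # approximate cell number at the end of larval life

# Number of doublings needed for a 1,000-fold increase in cell number.
doublings = log2(end_cells / start_cells)
```

Under this simplification the disc undergoes about ten rounds of doubling during larval stages.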


Patterning and growth by membrane-tethered Wingless 
A fusion protein comprising the type 2 transmembrane protein Neurotactin 
and Wg (NRT-Wg) has been previously shown in clonal overexpression 
assays to activate target genes only within expressing cells and in 
surrounding adjoining cells. As a first step towards introducing a 
complementary DNA encoding NRT-Wg into the wg locus, we used 
an improved protocol for homologous recombination to delete the 
wild-type ATG-containing exon and replace it with a cassette that includes 
an attP recombination site, a mCherry-encoding cDNA, and other ele- 
ments (Extended Data Fig. 1a). As expected, in heterozygous larvae, 
Cherry was expressed in a pattern that broadly recapitulated that of 
endogenous wg (Extended Data Fig. 1b). Most extraneous genetic ele- 
ments were then removed, leaving only the attP site and a single LoxP 
site at the locus (Extended Data Fig. 1a). The resulting allele behaved as 
a null and is therefore referred to hereafter as wg knockout (wg{KO}) 
(Extended Data Fig. 1a). A wild-type Wg cDNA was inserted into the 
attP site, along with mini-white as a genetic marker, and the resulting 
flies (wg{KO; Wg}) were indistinguishable from wild-type flies, confirming 
faithful expression of the reintegrated cDNA (Extended Data 
Fig. 1d, f, g). Next, a cDNA encoding haemagglutinin-tagged NRT-Wg 
was introduced at the targeted locus, using either mini-white or pax-Cherry 
as a genetic marker, to generate wg{KO; NRT-Wg} (Extended 
Data Fig. 1e; see Methods for details about genetic marker usage throughout 
this manuscript). Notably, when grown without crowding, homozygous 
animals that have NRT-Wg as their sole source of Wg were 
viable and had normally patterned appendages and other cuticular 
structures (Fig. 1a, b and Extended Data Fig. 1f-o). The wings were 
slightly smaller than those of control siblings (10-12% surface area 
reduction; Extended Data Fig. 1f-i). However, the arrangement of 
margin sensory bristles appeared normal (as shown at high magnifica- 
tion in Extended Data Fig. 1j-l), an indication of appropriate senseless 
expression during imaginal development. Other target genes, vestigial 
(not shown), Distal-less (immunofluorescence) and frizzled 3 (from a 
gene trap) were mildly affected, with their expression dropping off 
more sharply than in controls but without a noticeable consequence 
on adult wing patterning (Fig. 1b). Importantly, these reporters of Wg 
signalling continued to be expressed in a broad domain straddling the 
dorsoventral boundary (Fig. 1c-f). 


Signalling by Neurotactin-Wingless is juxtacrine 

A key question is how NRT-Wg might activate target genes seemingly 
at a long range. Staining of homozygous wg{KO; NRT-Wg} imaginal 
discs with anti-Wg showed a restricted distribution by comparison to 
that in control discs although weak staining could be detected at the 
surface of cells located on either side of the main expression domain 
(Fig. 1g, h). To assess whether this could be due to protein perdurance 
or release (for example, by cleavage or on exosomes), we devised a 
system for timed removal of the NRT-Wg cDNA. A wg{KO; FRT 
NRT-Wg FRT} allele was created (FRT, Flp recombinase target) 
and combined with hedgehog-gal4, UAS-flp and tubulin-gal80ts. 
Transferring the resulting larvae from 18 °C to 29 °C triggered excision 
of the FRT NRT-Wg FRT cassette in patches of cells within the P 
compartment. No anti-haemagglutinin immunoreactivity (that is, 
NRT-Wg) could be detected in the patches 12 h after the temperature 
shift (red and white arrows in Fig. 1i). As residual Gal80ts activity 
perdures as late as 5 h after shifting to 29 °C (ref. 25), we estimate that 
NRT-Wg persists for less than 7 h and is therefore not exceptionally 
stable. Importantly, lack of anti-haemagglutinin staining within the 
patches also shows that no detectable Wg is released from 
NRT-Wg-expressing cells. The range of NRT-Wg was functionally assessed by 
analysing the expression of wg target genes (senseless and Distal-less) in 
wg-null mutant clones generated in a wg{KO; NRT-Wg} background 
(‘null-in-NRT’ clones). For comparison, wg null mutant clones were 
also induced in a wild-type background (‘null-in-WT’ clones). In both 
cases, senseless was activated within one cell diameter from the clone 
edge (Extended Data Fig. 2a, b). Therefore, NRT-Wg achieves the level 
of signalling required to activate senseless expression. Next we assessed 
the range of NRT-Wg with Distal-less, a target gene thought to require 
lower signalling activity. Null-in-NRT clones located in region 1 
(clones located elsewhere will be addressed later) failed to maintain 
Distal-less expression except in adjoining cells (Fig. 2b, red arrows). By 
contrast, Distal-less expression was maintained throughout similarly 
sized and positioned null-in-WT clones (Fig. 2a, turquoise arrowhead). 
These results confirm that wild-type Wg can spread and signal over a 
few cell diameters and that, in region 1, Wg signalling is continuously 
required for high Distal-less expression. Importantly, they also 
demonstrate that no active Wg or other functional Wnts are released from 
the NRT-Wg-expressing cells. 


1MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK. 

Figure 1 | Characterization of membrane-tethered Wingless expressed 
from the wingless locus. a, Homozygous wg{KO; NRT-Wg} flies and one 
control (wg{KO; NRT-Wg}/GlaBC) fly. b, Wings from a control (wg{KO; 
NRT-Wg}/GlaBC) and a homozygous wg{KO; NRT-Wg} fly (see size 
information and details of the wing margin in Extended Data Fig. 1i-l). 
c, d, Expression of Frizzled-3-GFP and Distal-less (Dll) in control and 
homozygous wg{KO; NRT-Wg} imaginal discs. e, f, Fluorescence intensity 
profile (grey value; arbitrary units) along lines connecting red arrowheads. 
g, h, Wg immunoreactivity in a control (wild type) and a homozygous 
wg{KO; NRT-Wg} disc, with corresponding transverse sections (along line 
connecting white marks). i, Excision of haemagglutinin (HA)-tagged 
NRT-Wg (red and white arrows) from wg{KO; FRT NRT-Wg FRT QF} 
with Flp produced from hedgehog-gal4 tubulin-gal80ts UAS-flp at 29 °C. 
Low-magnification frontal view and high-magnification transverse section 
along line defined by white marks are shown. 


Prolonged transcription and cellular memory 


So far we have shown that NRT-Wg does not act beyond adjoining 
cells and yet, extensive genetic analysis has shown that Wg signalling 
is required within the prospective wing blade, seemingly far from the 
source of Wg. The combined effect of two processes provides a 
solution to this paradox. One such process was revealed by careful 
analysis of the pattern of wg transcription. This was facilitated by insert- 
ing a Gal4-encoding cDNA into the attP site of the wg{KO} allele, and 
crossing the resulting flies (wg{KO; Gal4}) to flies carrying UAS-GFP 
(Fig. 3a). GFP (green fluorescent protein) signal could be seen throughout 
the pouch until mid-third instar stage, suggesting that the ‘Wg 
target cells’ are indeed transcribing wg during this period of development. 
As the perdurance of GFP could lead to an overestimate of the 
time when transcription terminates, we devised a Flp-gated reporter of 
wg transcription, comprising wg{KO; FRT Wg FRT QF}, hs-Flp and 
QUAS-Tomato (Fig. 3b). In such larvae, excision of the FRT Wg FRT 
cassette by heat shock triggers expression of the heterologous tran- 
scription factor QF, and hence QUAS-Tomato, in cells that are tran- 
scribing wg at the time. As shown in Fig. 3b, heat shocks at early-third 
instar led to clones throughout the prospective wing, while clones 
induced later became spatially restricted. A similar behaviour was seen 
with another reporter designed to provide a permanent record of transcrip- 
tional activity at experimentally defined times (Extended Data Fig. 3b, c). 
To confirm directly that wg is transcriptionally active, imaginal discs 
were processed for fluorescence in situ hybridization (FISH) with a wg 
probe. Specific signal (using the prospective notum as a negative control) 
was seen in the prospective wing region as late as 84 ± 3 h after egg 
laying (Fig. 3c; see also ref. 13). Therefore, multiple experimental 
approaches showed that wg transcription is active throughout the wing 
pouch during a critical period of patterning and growth (see also ref. 14), 
but then recedes towards the dorsoventral boundary during the first 
half of third instar. Imaginal discs of larvae carrying wg{KO; Gal4} and 


Figure 2 | Gene expression in wingless-null mutant patches surrounded 
by wild-type or Neurotactin-Wingless-expressing cells. a-c, Expression 

of Dll and Wg, as detected by immunofluorescence, in ‘null-in-wt’ (a) or 
‘null-in-NRT’ (b, c) clones. Null wg mutant territory (wg) is marked by the 
absence of GFP. Red arrows indicate short-range activation of Dll expression by 
NRT-Wg. Turquoise arrowhead highlights longer range of wild type Wg. 
Purple arrowhead and yellow asterisk indicate persistent expression (memory) 


UAS-HRP-CD8-GFP, a particularly stable reporter, vividly illustrate 
that all the cells of the prospective wing blade derive from cells that 
express wg during early-third instar (Fig. 3d). The wg transcriptional 
program could provide a local source of low-level protein in the 
prospective wing blade during the first part of third instar. 
in zones 2 and 3, respectively. d, Diagram showing regions in which the 
behaviour of null mutant clones was analysed. Region 1 links the prospective 
wing to the prospective hinge. Region 2 represents the domain straddling the 
dorsoventral boundary that expresses high-level Dll. Region 3 represents the 
prospective wing blade, which expresses graded, low-level Dll. Age differences 
within the third instar account for disparities in disc size between b (younger) 
and c (older). 


Although wg transcription in the prospective blade extends later 
than previously thought, it does terminate around the mid-third instar. 
Therefore an additional mechanism is needed to sustain growth and 
patterning without releasable Wg beyond this stage. The nature of this 
mechanism is suggested from the behaviour of vestigial and Distal-less 


Figure 3 | Activity of the wingless 
promoter in the prospective wing. 
a, GFP expression driven by 

wg{KO; Gal4} at different stages. 
AEL, after egg laying. b, QUAS- 
Tomato-expressing clones in larvae 
of the genotype hs-Flp QUAS- 
Tomato; wg{KO; FRT NRT-Wg FRT 
QF}/Cyo. Larvae were heat shocked 
18 h before fixation at the times 
indicated. White arrows indicate the 
presence of Tomato-expressing cells 
in the middle of the prospective wing 
blade, far from the prospective 
margin. c, FISH with a wg probe in 
wing imaginal discs obtained 84 ± 3 
h AEL. Red arrow indicates incipient 
wg expression in prospective thorax. 
White marks indicate the plane of 
transverse section. White arrow 
shows FISH signal in the apical 
domain. d, HRP activity (detected by 
DAB staining) in a late-third instar 
wg{KO; Gal4}/+, UAS-HRP-CD8- 
GFP wing imaginal disc. Activity is 
detectable throughout the pouch 
(black arrow) but not in the 
prospective thorax (red arrow). 


in response to changes in Wg signalling. As expected from classical target 
genes, these two genes are activated by ectopic signalling activity and 
their expression terminates in small patches of cells that are made unable 
to respond to Wg (for example, by genetic removal of the receptors). 
However, unlike classical target genes, their expression persists, albeit 
at a reduced level, when signalling is prematurely terminated throughout 
the disc or a compartment. Such persistence can be seen for 
Distal-less in regions 2 and 3 of null-in-NRT mosaic discs. For example, 
in clones that transect the dorsoventral boundary in the central portion 
of the disc (region 2), Distal-less expression was reduced but not elimi- 
nated (for example, Fig. 2c, purple arrowhead). Note that high Distal- 
less continued to be expressed in null-in-NRT cells located at the edge 
of the clones, both in region 1 and in region 2 (Fig. 2b, c, red arrows). By 
contrast, high Distal-less expression was maintained throughout simi- 
larly sized null-in-WT clones (Fig. 2a). These observations further con- 
firm that the NRT territory only triggers juxtacrine signalling. In region 3, 
further away from the dorsoventral boundary, Distal-less expression 
was essentially unchanged in ‘null-in-NRT’ clones (Fig. 2c, yellow asterisk), 
again confirming that Distal-less expression persists in this region 
even after removal of Wg signalling. Persistent target-gene expression 
within ‘null-in-NRT’ clones could explain their continued albeit slower 
(in comparison with null-in-WT) growth (Extended Data Fig. 2c-f). 
Memory of earlier signalling, which has been suggested previously for 
Dpp, could involve classical epigenetic control. Indeed, the expression 
of a vestigial reporter construct has been shown to be influenced by the 
presence or absence of Polycomb response elements. Autoregulation 
could also contribute to the sustained expression of target genes. 
Irrespective of the underlying mechanism, persistence of target-gene 
expression could explain, in part, why wg{KO; NRT-Wg} discs continue 
to grow beyond the time when wg transcription terminates in the 
prospective wing blade. It also suggests that during normal development, 
target-gene expression within the prospective blade (region 3) does not 
necessarily require Wg to spread from the dorsoventral boundary. 

Organ growth and developmental timing 


Even though wg{KO; NRT-Wg} homozygous flies are morphologi- 
cally normal, larvae of this genotype entered pupariation with a delay 
compared to controls (Fig. 4a). In addition, although they eclosed at a 
near normal frequency when cultured on their own, they largely failed 
to do so in co-culture with controls (Fig. 4b); a strong indication of reduced 
fitness. As imaginal disc damage can delay larval development, it 
is conceivable that preventing Wg release could affect developmental 
timing of the larva indirectly by slowing down disc development. To 
test this possibility, we developed a method for tissue-specific allele 
switching in second instar imaginal discs (Extended Data Fig. 4a, b). 
Thus, we created larvae expressing NRT-Wg only in imaginal discs 
(wg{KO; WT^body; NRT-Wg^discs}; Fig. 4c, d). Larvae of this genotype 
developed at the same rate as controls (Extended Data Fig. 4c) and 
eclosed at near normal frequency when co-cultured with control sib- 
lings (Fig. 4e). By contrast, larvae expressing wild-type Wg in imaginal 
discs and NRT-Wg elsewhere (wg{KO; NRT-Wg^body; WT^discs}; Fig. 4j, k) 
largely failed to eclose when co-cultured with control siblings even 
though they grew into normal-looking adults when grown in separate 
vials (Fig. 4l, n, p). These observations suggest that Wg release, or 
signalling activity not achieved by NRT-Wg, is required in tissues 
other than imaginal discs for timely development and general fitness. 
Nevertheless, further analysis of wg{KO; WT^body; NRT-Wg^discs} flies 
suggested that, in addition, the growth rate of imaginal discs expressing 
NRT-Wg is autonomously compromised. The wings of such mosaic 
flies were smaller than those of controls (-23%; Fig. 4f-h and Extended 
Data Fig. 4d). This constitutes a more extreme size reduction than that 
seen in wings obtained from animals expressing NRT-Wg throughout 
(wg{KO; NRT-Wg}; -12%). One possible interpretation is that in the 
absence of Wingless release, imaginal disc growth slows down in an 
organ-autonomous manner, preventing a normal size from being reached in 
larvae that develop at a normal rate, as is the case in the wg{KO; 
WT^body; NRT-Wg^discs} genotype. To confirm this possibility, we measured 
disc size in wg{KO; WT^body; NRT-Wg^discs}, wg{KO; NRT-Wg}, 
and control animals at the onset of pupariation. The results showed 
that Wg tethering specifically in imaginal discs causes a ~15% size 
deficit (Extended Data Fig. 4e-g), a relatively mild reduction considering 
that growth occurs over a period of about 4 days. Another noteworthy 
feature of wg{KO; WT^body; NRT-Wg^discs} flies is that their wing 
margin lacked occasional sensory bristles (Fig. 4i), a phenotype not 
seen in homozygous wg{KO; NRT-Wg} flies grown in optimal conditions 
(Extended Data Fig. 1j-l). As margin specification occurs at the 
very end of larval life, we suggest that in wg{KO; WT^body; NRT-Wg^discs} 
flies, the margin cannot be completed for lack of time before 
pupariation (see Methods for another example of wing-body asynchrony). 
The above considerations suggest that a disc-autonomous 
growth delay does not hold up developmental progression of the 
organism and therefore that, at least in this instance, there is no 
organ-size control checkpoint. Moreover, our mosaic analysis by allele 
switching also shows that Wg release, or a level of signalling activity not 
achieved by NRT-Wg (see Methods for further discussion on their 
relative importance), contributes independently to ensuring timely 
growth in an organ-specific manner and to pacing organismal 
developmental progress. 

Figure 4 | Tissue-specific allele switching to assess the contribution of 
Wingless release to organ-autonomous growth rate and organismal 
developmental timing. a, Homozygous wg{KO; NRT-Wg} animals 
(70 animals, 8 experiments) were developmentally delayed (P < 0.001) relative 
to control siblings (wg{KO; NRT-Wg}/Cyo-GFP, 34 animals, 6 experiments). 
Larvae were grown in separate vials. b, Normalized eclosion rate of 
homozygous wg{KO; NRT-Wg} larvae cultured with (130 animals, 
4 experiments) or without (90 animals, 6 experiments) control siblings. 
c, d, Wg to NRT-Wg conversion with vestigial-gal4 (vg-G4) UAS-Flp. 
Expected NRT-Wg tissue is shown in red. e, Normalized eclosion rate of 
resulting wg{KO; WT^body; NRT-Wg^discs} larvae co-cultured with control 
siblings (80 animals, 4 experiments). f-i, The wings of wg{KO; WT^body; 
NRT-Wg^discs} flies were significantly smaller than those of control siblings 
(-23%, P < 0.001) and had an incomplete margin (missing bristles; red arrow). 
j, k, NRT-Wg to Wg conversion with vestigial-gal4 UAS-Flp. l, Normalized 
eclosion rate of the resulting wg{KO; WT^discs; NRT-Wg^body} larvae co-cultured 
with control siblings (>140 animals, 4 experiments) was significantly smaller 
than that of wg{KO; WT^body; NRT-Wg^discs} larvae (P < 0.001). m-p, The wing 
size and margin of wg{KO; WT^discs; NRT-Wg^body} flies were similar to those 
of sibling controls (P > 0.05). Error bars represent s.d. Statistical significance 
was assessed using Student's t-test. 


Conclusion 


In this paper we have asked directly to what extent the spread of Wg, 
the main Wnt of Drosophila, is required for normal development of 
the wing. Transcription of wg takes place throughout wing precursors 
at early stages, and later becomes restricted to a narrow stripe at the 
dorsoventral boundary. From there, the protein product spreads to form 
a gradient, as expected from a morphogen. However, the requirement 
for graded expression has been questioned. In addition, as we 
showed here, juxtacrine Wg signalling suffices for extensive growth 
and patterning. Early wg expression throughout the wing primordium, 
along with the persistent effect of signalling, ensures continued 
target-gene expression in the absence of Wg release. We infer that these 
processes contribute substantially to patterning and growth in the wild 
type. Nevertheless, target genes remain responsive until late stages. 
Therefore, Wg spreading from the dorsoventral boundary is likely to 
boost proliferation, at least in cells within its reach. Our results, along 
with those of previous studies, suggest that the requirement for 
long-range spreading of other Wnts should be revisited. Furthermore, 
our genome-editing approach and associated tools for mosaic analysis 
provide a template for further investigation of other signalling proteins. 


METHODS SUMMARY 


All the standard fly strains used in this study are described at http://www.flybase.org 
and all experiments were conducted at 25°C unless otherwise indicated. UAS-HRP- 
CD8-GFP encodes a transmembrane protein comprising extracellular HRP, the 
transmembrane domain of mouse CD8 and intracellular GFP (details upon request). 
Gene targeting was performed with a new vector and protocol described elsewhere. 
The key elements of the targeting vector are shown in Extended Data Fig. 1a. The 
homology arms used for targeting of the wg locus were amplified by PCR from a 
BAC (http://www.pacmanfly.org) with primers listed in Methods. The mini-white 
marker of the targeting vector was used for initial identification of candidate targeted 
flies. True recombinants were confirmed by several criteria, including expression 
of Cherry, the phenotype of homozygous animals and PCR (primers and protocols 
described in Methods). Various constructs were integrated into the attP site of the 
targeted allele with a variety of ‘reintegration vectors (RIVs)’ using mini-white or 
pax-Cherry as a marker. Both markers were used interchangeably for wg{KO; NRT-Wg}. 
Immunostaining of imaginal discs was performed according to standard protocols 
with the following primary antibodies: rabbit anti-HA (1:1,000, Cell Signalling 
Technology, C29F4), mouse anti-Wingless (1:100, Hybridoma Bank), rabbit anti-Vestigial 
(1:50, gift from S. Carroll), mouse anti-Distalless (1:300, gift from S. Carroll), 
rabbit anti-Senseless (1:500, gift from H. Bellen), and chicken anti-β-galactosidase 
(1:200, Abcam). In all micrographs, blue staining shows DAPI, a nuclear marker. 
FISH was performed according to standard protocols. Temperature shifts were 
performed by transferring culture vials between incubators maintained at the 
desired temperatures. Details on measuring developmental timing are provided 
in Methods. For all quantitative measurements, the error bars represent standard 


184 | NATURE | VOL 505 | 9 JANUARY 2014 


deviation. Samples were normally distributed. Sample size was chosen to ensure 
statistical significance, which was assessed by Student’s t-test using Prism 6. 
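
As an illustration of the comparison described above, the following is a minimal sketch of an unpaired, equal-variance Student's t statistic. It is not the Prism 6 procedure used in the study: the function name and the example measurements are hypothetical, and converting t to a P value would additionally require the t distribution, which is not included here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an unpaired, equal-variance t-test.

    Hypothetical helper for illustration; assumes both samples are
    approximately normally distributed with similar variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Made-up wing-size measurements (arbitrary units), not data from the paper.
experimental = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [2.0, 3.0, 4.0, 5.0, 6.0]
t_value, dof = student_t(experimental, control)
```

The resulting t value would then be compared against the t distribution with the returned degrees of freedom to obtain a P value.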


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

METHODS 


Editing the wingless locus. Gene targeting was performed with a targeting vector 
and according to protocols described elsewhere. The primers used to amplify 
homology arms and to confirm gene targeting are the following: forward 5’ arm, 
5'-GATCAGTGCGGCCGCGCCGAGAAGAGATCGCCACCACCACTCTACT 
CTTTTGCACATGCC-3’; reverse 5’ arm, 5’-GATCGCTAGCCGGCACACACA 
CTCTCACACTGACACACGGGGTATGATAGATACTTCC-3’; forward 3’ arm, 
5'-GATCATGCATGGACACTGCCCGCCTCCAGCCCAGTCCCGTCCTCTGA 
AGCCGCCC-3’; reverse 3’ arm, 5’-GATCCCTAGGGCCGATCTGTTGCAATT 
TCCAAATCAAACAGCGCGCGAAACGTGTGGC-3’; for confirmation of 5’ 
recombination (genomic, forward), 5’-CAGCACTAAAATGGCTTCCTCCGC-3’; 
for confirmation of 5’ recombination (vector, reverse), 5’-CAACTGAGAGAAC 
TCAAAGG-3’ (within attP site); for confirmation of 3’ recombination (vector, 
forward), 5'-TCGTATAATGTATGCTATACG-3’ (within Gal4 Poly A); for con- 
firmation of 3’ recombination (genomic, reverse), 5‘-GTTCCCGGAATAGTTTA 
GACCTC-3’. 

After confirmation of targeting into wingless, much of the targeting vector was 
removed by crossing to a strain expressing Cre constitutively (Bloomington stock 
851) (procedure outlined in ref. 24). The resulting strain, referred to as wg{KO} in 
the manuscript, was used as a host for reintegration of various constructs via the 
attP site. Reintegration was achieved by injecting a wg{KO} strain expressing the 
PhiC31 integrase. 

The following reintegration vectors (RIV) were generated: RIV{wg, mini-white} 
(Extended Data Fig. 1d), the full-length wg cDNA, along with 135 bp of 5’ untranslated 
region (UTR) and 1,200 bp of 3’ UTR, was inserted as a NotI-AscI fragment 
into RIV{MCS; mini-white} (ref. 24); RIV{FRT Wg FRT QF; pax-Cherry} (Fig. 3b), 
this was obtained by cloning the above NotI-AscI wg fragment into 
RIV{FRT-MCS2-FRT QF; pax-Cherry}, which is described in ref. 24; 
RIV{NRT-Wg; mini-white} (Fig. 1a-c and Extended Data Fig. 1e), pMT-NRT-HA-Wg was made by 
replacing the NotI-BglII fragment of pMT-Wg with a synthetic fragment (GENEWIZ) 
containing, from 5’ to 3’, the 5’ UTR of wg, the NRT open reading frame, DNA 
encoding two HA epitopes, and remaining wg coding sequences (up to the BglII site), 
then DNA encoding NRT-HA-Wg was transferred to RIV{MCS; mini-white} as 
a NotI-AscI fragment; RIV{FRT NRT-Wg FRT QF; pax-Cherry} (Fig. 1i) was 
obtained by cloning the above NotI-AscI NRT-HA-Wingless fragment into 
RIV{FRT-MCS2-FRT QF; pax-Cherry}, which is described in ref. 24; RIV{Gal4; mini-white} 
(Fig. 3a and Extended Data Fig. 3c) is described elsewhere; RIV{FRT Wg FRT 
NRT-Wg; pax-Cherry} (Fig. 4c and Extended Data Fig. 4b), first, the QF open 
reading frame was excised from RIV{FRT Wg FRT QF; pax-Cherry} with AvrII and 
AgeI and replaced by an XbaI-AgeI fragment containing NRT-HA-Wg obtained 
from pMT-NRT-HA-Wg; RIV{FRT NRT-Wg FRT Wg; pax-Cherry} (Fig. 4j), an 
XbaI-AgeI fragment encoding Wingless was obtained from pMT-Wg and cloned 
into RIV{FRT NRT-Wg FRT QF; pax-Cherry} pre-digested with AvrII and AgeI. 
Immunostaining, FISH and microscopy. The following primary antibodies were 
used: rabbit anti-HA (1:1,000, Cell Signalling Technology, C29F4), mouse anti-Wingless 
(1:100, Hybridoma Bank), rabbit anti-Vestigial (1:50, gift from S. Carroll), mouse 
anti-Distalless (1:300, gift from S. Carroll), rabbit anti-Senseless (1:500, gift from 
H. Bellen), and chicken anti-β-galactosidase (1:200, Abcam). Secondary antibodies 
labelled with Alexa 488 or Alexa 555 (used at 1:200) were obtained from Molecular 
Probes. Imaginal discs were mounted in Vectashield with DAPI (Vector Laboratories). 
In all micrographs, blue staining shows DAPI, a nuclear marker. Staining for HRP 
activity with DAB was performed as previously described. FISH was performed 
according to standard protocols. Fluorescence micrographs were acquired with a 
Leica SP5 confocal microscope. Embryo cuticles were prepared according to stand- 
ard protocol. Bright field images from embryo cuticles were obtained with a Zeiss 
Axiophot2 microscope with an Axiocam HRC camera. Bright field and confocal 
images were processed with Photoshop CS4 (Adobe). Measurements of adult wing 
size, cell density and generation of fluorescence intensity profiles in wing imaginal 
discs were carried out as described previously. 

Temperature shifts to assess Neurotactin-Wingless perdurance and wingless 
promoter activity. The perdurance of NRT-Wg was estimated in larvae of the 
genotype wg{FRT NRT-Wg FRT QF}/wg{KO}; hedgehog-gal4 tubulin-gal80ts/UAS-Flp 
(Fig. 1i). Larvae of this genotype were kept at 18 °C until late-third instar. They 
were then transferred to 29 °C for 12 h to allow excision of NRT-Wg. Under these 
conditions, excision occurred in groups of cells without any apparent spatial 
reproducibility. To assess the activity of the wingless promoter, wg{KO; Gal4}, 
tubulin-gal80ts/Cyo flies were crossed to flies carrying UAS-Flp (Bloomington stock number 
4539) and Actin FRT stop FRT LacZ (Bloomington stock number 6355). The progeny 
were allowed to lay eggs for 24 h at 18 °C. The resulting larvae were kept at 
18 °C and then transferred to 29 °C at the indicated stages. Larvae were allowed to 


continue development at this temperature until late wandering stage, before 
pupariation. The timeline of this protocol is illustrated in Extended Data Fig. 3b. 
In both experiments imaginal discs were then processed for conventional 
immunofluorescence. 
Assessing the size of wingless mutant territories. wg mutant territories were 
generated with hedgehog (hh)-gal4 and UAS-flp, which trigger mitotic recombination 
in virtually all cells of the P compartment at early stages of larval development. 
In control experiments, genetically marked wild-type tissue generated in a 
wild-type background occupied half of the posterior compartment. Unlike standard 
heat shock-induced clonal analysis, this approach overcomes the issues raised 
by differences in developmental timing between controls and experimentals. The 
genotypes used to generate the data shown in Extended Data Fig. 2c-f were: FRT40 
we*“/FRT40 ubiquitin-GFP; hh-gal4, UAS-Flp (Extended Data Fig. 2c) FRT40 
we*“/FRT40 ubiquitin-GFP NRT- Wg; hh-gal4, UAS-Flp (Extended Data Fig. 2d). 
For both genotypes, larvae were dissected and fixed at the end of the third 
instar, before pupariation. At least 30 confocal planes were combined by maximal 
projection. The areas of GFP-positive and GFP-negative territories were mea- 
sured in the prospective wing blade (area encircled by white dots in Extended 
Data Fig. 2e) with ImageJ. Over 10 discs were analysed for each genotype. 
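
The area comparison described above can be sketched as follows. This is only an illustrative calculation under stated assumptions: it assumes the quantity of interest is the GFP-negative (wingless-null) area expressed as a percentage of the total measured area in the prospective wing blade, the function names are hypothetical, and the numbers are made up rather than taken from the ImageJ measurements.

```python
from statistics import mean, stdev


def percent_mutant_area(gfp_negative_area, gfp_positive_area):
    """Percentage of the measured territory occupied by GFP-negative cells."""
    total = gfp_negative_area + gfp_positive_area
    if total <= 0:
        raise ValueError("total measured area must be positive")
    return 100.0 * gfp_negative_area / total


def summarize_discs(discs):
    """Return (mean, s.d.) of the mutant-area percentage across discs.

    `discs` is a list of (gfp_negative_area, gfp_positive_area) pairs,
    one pair per imaginal disc, in any consistent area unit.
    """
    percents = [percent_mutant_area(neg, pos) for neg, pos in discs]
    return mean(percents), stdev(percents)


# Made-up areas for two discs (arbitrary units), for illustration only.
example_mean, example_sd = summarize_discs([(30.0, 70.0), (50.0, 50.0)])
```

Per-genotype means and standard deviations computed this way could then be compared between the two backgrounds.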
Normalized eclosion rates. For co-culture assays, eggs deposited by balanced flies 
were allowed to develop in a single vial and all the eclosed adults were genotyped 
(using the dominant marker on the balancer chromosome). The number of experimental 
(non-balancer) animals was normalized to the Mendelian ratio-adjusted number of 
sibling controls (balancer). For example, eggs deposited by wg{KO; NRT-Wg}/Cyo 
gave rise to homozygous wg{KO; NRT-Wg} (experimental) and wg{KO; NRT-Wg}/Cyo 
(control) flies. As these are expected in a 1:2 ratio, the normalized eclosion 
rate was calculated as (no. of experimental animals)/(0.5 × no. of controls). 
Note that the normalized eclosion rate can be larger than 100% as balancer 
chromosomes are likely to reduce fitness. For separate culture experiments, first instar 
larvae (24-28 h after egg laying) were genotyped using GFP-expressing balancer 
chromosomes. Fifteen larvae of each genotype were then transferred into separate 
vials and kept at 25 °C under normal laboratory conditions. The number of eclosed 
experimental flies was normalized to the number of eclosed controls. 
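
The co-culture normalization described above can be written as a short calculation. The sketch below uses a hypothetical function name and made-up counts; the default factor of 0.5 corresponds to the 1:2 experimental-to-control Mendelian expectation stated in the text and would need to be changed for other crosses.

```python
def normalized_eclosion_rate(n_experimental, n_control, expected_ratio=0.5):
    """Normalized eclosion rate for a co-culture assay.

    expected_ratio is the Mendelian expectation of experimental animals per
    control animal (0.5 for a 1:2 ratio). Values above 1.0 (100%) are
    possible, for example if balancer chromosomes reduce control fitness.
    """
    if n_control <= 0:
        raise ValueError("at least one control animal is required")
    return n_experimental / (expected_ratio * n_control)


# Made-up counts: 45 homozygous (non-balancer) and 100 balancer siblings.
rate = normalized_eclosion_rate(45, 100)
print(f"normalized eclosion rate: {rate:.0%}")
```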
Time of pupariation. First instar larvae (24-28 h after egg laying) were genotyped 
using GFP-expressing balancer chromosomes. Ten to fifteen larvae of each genotype 
were transferred into separate vials and kept at 25 °C under normal laboratory 
conditions until they started pupariation. The number of larvae pupariating 
within a 2-h interval was then recorded. At least four independent experiments 
were performed for each genotype (Fig. 4a). 
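
A minimal sketch of how such timing records might be tabulated is given below. It assumes each larva is recorded with its pupariation time in hours after egg laying; the function names and example times are hypothetical and are not taken from the study.

```python
from collections import Counter


def pupariation_histogram(times_h, bin_width_h=2.0):
    """Count larvae pupariating within consecutive bins of bin_width_h hours.

    Keys are the start of each bin (hours after egg laying).
    """
    counts = Counter()
    for t in times_h:
        bin_start = (t // bin_width_h) * bin_width_h
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


def cumulative_fraction(histogram):
    """Cumulative fraction of larvae that have pupariated by the end of each bin."""
    total = sum(histogram.values())
    running = 0
    fractions = {}
    for bin_start, count in histogram.items():
        running += count
        fractions[bin_start] = running / total
    return fractions


# Made-up pupariation times (hours after egg laying) for one vial.
times = [118.5, 119.0, 120.2, 121.9, 124.0]
hist = pupariation_histogram(times)
curve = cumulative_fraction(hist)
```

Cumulative curves for two genotypes, pooled across independent experiments, could then be compared to assess a developmental delay.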
Genetic markers of reintegration. Depending on experimental requirements, 
two genetic markers, mini-white and pax-Cherry, were used alternately to track 
reintegrated genetic elements (as shown for wg{KO; NRT-Wg} in Extended Data 
Fig. 1e). We found that the presence of mini-white and pax-Cherry downstream 
of the reintegrated cDNA had an equally detrimental impact on quantifiable 
physiological functions such as developmental progression (see below), and orga- 
nismal motility (not shown). Although the choice of marker is unlikely to be 
relevant, marker usage is listed in Supplementary Notes. 
Removal of the genetic marker from wg{KO; NRT-Wg} preferentially ameliorates 
developmental timing. In the main text we have referred to wg{KO; 
NRT-Wg; pax-Cherry} and wg{KO; NRT-Wg; mini-white} interchangeably as wg{KO; 
NRT-Wg} as they behaved identically in all assays. In preliminary experiments, 
we assessed the effect of removing the genetic markers (pax-Cherry or mini-white) 
by Cre-mediated excision (see Extended Data Fig. 1e for position of the 
LoxP sites). We found that homozygous wg{KO; NRT-Wg; floxed} larvae grew 
faster and competed more effectively in co-culture than homozygous wg{KO; 
NRT-Wg} larvae (carrying mini-white or pax-Cherry), a phenotypic improvement 
that is likely to be due to enhanced NRT-Wg expression. However, the wing 
margin of homozygous wg{KO; NRT-Wg; floxed} flies was seen occasionally to 
miss sensory bristles. This confirms our suggestion that a minor extension of the 
growth period helps NRT-Wg flies to make a perfectly patterned wing margin. 
Our evidence so far is consistent with the notion that the developmental delay of 
wg{KO; NRT-Wg} animals could be due mostly to a mild reduction in expression 
and hence activity. It remains to be determined whether the wing-autonomous 
growth defect is due to the lack of Wg release or reduced expression. 
Sample collection. Animals for analysis were chosen randomly from a large 
collection at specific developmental age as required. When necessary, a given 
sex was studied as specified. 

Extended Data Figure 1 | Engineering the wg locus to express membrane-tethered 
Wg. a, Structure of the wingless locus before targeting, after targeting, 
and after Cre-mediated excision. The wg{KO} allele was used as a founder line 
for subsequent reintegration. b, Cherry expression in wing, leg and haltere 
imaginal discs of larvae carrying one copy of wg{KO; Cherry}. c, Cuticle 
preparation of a homozygous wg{KO} larva at low and high magnification 
(black arrow). The phenotype is identical to that of wg* homozygous 
embryos. d, Diagram showing the reintegration of a wild-type wingless cDNA 
in wg{KO} to generate wg{KO; Wg} (note presence of mini-white). 
e, Diagram showing the reintegration of the NRT-Wg cDNA in wg{KO}. This 
was achieved using either pax-Cherry or mini-white as a genetic marker, as 
indicated. f, Wing of a wild-type fly. g, Wing of a wg{KO; Wg} homozygous fly. 
h, Overlay of the wings shown in Fig. 1b to illustrate the mild wing size 
reduction in NRT-Wg flies. i, Wing size of wg{KO; NRT-Wg} homozygous 
(n = 14) and control (wg{KO; NRT-Wg}/GlaBC) flies (n = 16, ***P < 0.001). 
j-l, High-magnification view of the wing margin of wild-type, homozygous 
wg{KO; Wg}, and homozygous wg{KO; NRT-Wg} flies. They are barely 
distinguishable. m-o, Views of the dorsal thorax illustrate the normal 
arrangement of pattern elements such as microchaetes and macrochaetes in the 
genotypes indicated. Error bars represent s.d. Statistical significance was 
assessed using Student's t-test. 


©2014 Macmillan Publishers Limited. All rights reserved 




Extended Data Figure 2 | Senseless expression and growth in wingless-null 
patches surrounded by wild-type or Neurotactin-Wingless-expressing cells. 
a, b, Expression of Senseless (red) is lost in patches of wingless mutant cells 
(GFP-negative; wg** homozygotes) except in the cells located within one cell 
diameter of surrounding GFP-positive cells, which are wild-type cells (a) or 
homozygous wg{KO; NRT-Wg} (b) (white arrows). Mosaics were created by 
mitotic recombination in a way that generates approximately the same number 
of progenitors for the two genotypes, as described in Methods. c, Example of 
mosaic imaginal discs generated as above to measure the growth of wingless 
mutant territory (wg** homozygous; GFP-negative) relative to that of wild 
type (c) or wg{KO; NRT-Wg} homozygous (d) tissue. Wg and NRT-Wg, 
detected with anti-Wg, are shown in red. e, Outline of the territory where the 
surface areas were assessed. f, Quantification of the areas colonised by wingless 
mutant cells (GFP-negative) in the two genetic backgrounds. On average, the 
wingless-null territory was smaller in the wg{KO; NRT-Wg} homozygous 
background (n = 24) than in the wild type (n = 20, ***P < 0.001). Error bars 
represent s.d. Statistical significance was assessed using Student's t-test. 

imaginal disc development. a, Timing of key developmental stages at 
25 °C. b, Developmental timing at 18 °C as it relates to the results illustrated 
in c. c, Permanent labelling of wg-expressing cells and their descendants 
at different stages of development. Genotype was wg{KO; Gal4}, 
only excised in cells that express wingless at the time of shifting to 29 °C 
to activate Flp expression and hence excision of the stop cassette. Discs were 
shifted from 18 °C to 29 °C at different stages (shown in b) but they were 
fixed and stained at the same stage, just before pupariation. 

Extended Data Figure 4 | Tissue-specific allele switching to determine the 
anatomical origin of organismal developmental delay in Neurotactin- 
Wingless animals. a, Cumulative pattern of vestigial-gal4 activity in various 
organ precursors. Expression of vestigial-gal4 at any stage or place leads to 
excision of the stop cassette in Actin FRT stop FRT LacZ thus marking 
permanently the corresponding cells. As expected, nearly the whole wing and 
haltere discs were labelled at the end of larval development. In wing imaginal 
discs, only a few cells were β-Galactosidase-negative, and these did not overlap with 
the domain of Wg expression (anti-Wg, red). In the eye antennal disc, the 
patterns of Wg (white arrowhead) and β-Galactosidase expression are also 
non-overlapping. Therefore, in combination with UAS-Flp, vestigial-gal4 is 
expected to excise an FRT cassette throughout the domain of wingless 
expression. Examination of the brain and CNS shows that vestigial-gal4 is 
unexpectedly active in these tissues. b, In larvae of genotype vestigial-gal4, 
UAS-Flp, wg{FRT Wg FRT NRT-Wg}/Cyo wg, most of the wg-expressing 
cells in leg, haltere and wing imaginal discs, but not in the brain and CNS, 
were converted to expressing NRT-HA-Wg (anti-HA; green). 

c, Developmental timing in wg{KO; WT^body; NRT-Wg^discs} (vestigial-gal4, 
UAS-Flp, wg{FRT Wg FRT NRT-Wg}) and control (vestigial-gal4, UAS-Flp, 
wg{FRT Wg FRT NRT-Wg}/GlaBC) larvae (80 animals, 4 experiments). The 
two data sets cannot be statistically distinguished (P > 0.05). d, Adult wing size 
for three genotypes: wg{KO; WT^body; NRT-Wg^discs}/GlaBC obtained from 
selfed vestigial-gal4, UAS-Flp, wg{FRT Wg FRT NRT-Wg}/GlaBC (n = 16, 
shown in black); wg{KO; WT^body; NRT-Wg^discs}, obtained from homozygous 
vestigial-gal4, UAS-Flp, wg{FRT Wg FRT NRT-Wg} (n = 15, shown in purple); 
and wg{KO; NRT-Wg^body; WT^discs}, obtained from homozygous vestigial-gal4, 
UAS-Flp, wg{FRT NRT-Wg FRT Wg} (n = 13, shown in grey). e-g, Extent of 
Distal-less expression in wg{KO; WT^body; NRT-Wg^discs} heterozygotes (over 
Cyo; e) and homozygotes (f). All the discs were obtained from immobile larvae 
at the time of anterior spiracle eversion, an event that marks the onset of 
pupariation. The extent of the Distal-less domain was estimated from the 
surface area of a polygon drawn around the zone of immunoreactivity, as 
shown. The results, plotted in panel g, show a mild reduction in wg{KO; 
WT^body; NRT-Wg^discs} discs (n = 13) compared to controls (n = 20; 

***P < 0.001; n.s., not significantly different). Error bars represent s.d. 
Statistical significance was assessed using Student's t-test. 

The rarity of dust in metal-poor galaxies 

Galaxies observed at redshift z > 6, when the Universe was less than 
a billion years old, thus far very rarely show evidence of the cold 
dust that accompanies star formation in the local Universe, where 
the dust-to-gas mass ratio is around one per cent. A prototypical 
example is the galaxy Himiko (z = 6.6), which—a mere 840 million 
years after the Big Bang—is forming stars at a rate of 30-100 solar 
masses per year, yielding a mass assembly time of about 150 x 10° 
years. Himiko is thought to have a low fraction (2-3 per cent of the 
Sun’s) of elements heavier than helium (low metallicity), and although 
its gas mass cannot yet be determined its dust-to-stellar mass ratio 
is constrained’ to be less than 0.05 per cent. The local dwarf galaxy 
I Zwicky 18, which has a metallicity about 4 per cent that of the Sun’s* 
and is forming stars less rapidly (assembly time about 1.6 x 10° years) 
than Himiko but still vigorously for its mass’, is also very dust defi- 
cient and is perhaps one of the best analogues of primitive galaxies 
accessible to detailed study. Here we report observations of dust 
emission from I Zw 18, from which we determine its dust mass to be 
450-1,800 solar masses, yielding a dust-to-stellar mass ratio of about 
10° to 10” ° and a dust-to-gas mass ratio of 3.2-13 X 10°. If Zw 18 


PACS 100. um 


Declination offset (”) 


50 40 30 20 10 O -10 -20 
Right ascension offset (”) 


-30 -40 -50 


Figure 1 | The 100-1m and 160-m images of I Zw 18. The colour scale is in 
units of megajanskys per steradian. Shown is the far-infrared detection, at 
A= 100 pm, of dust emission in galaxy I Zw 18, and the marginal detection of 
I Zw 18 at 160 jum. These new observations (proposal identification, 

PID: OT_dbfisher_1) were obtained with the Herschel PACS. White contours 
represent the surface density of atomic hydrogen from the Very Large Array 
map"’. The beam size of the H1 map is 8.8 X 8.3 arcsec. For display purposes, 
the pixel size of the infrared maps is resampled to match the pixel size of 

the H1 map. At 100 um I Zw 18 is clearly detected, and well matched to the 
centre of the HI gas contours. The emission we detect in both far-infrared 


is a reasonable analogue of Himiko, then Himiko’s dust mass must 
be around 50,000 solar masses, a factor of 100 below the current 
upper limit. These numbers are quite uncertain, but if most high-z 
galaxies are more like Himiko than like the very-high-dust-mass 
galaxy SDSS J114816.64 + 525150.3 at z ~ 6, which hosts a quasar®, 
then our prospects for detecting the gas and dust inside such gala- 
xies are much poorer than hitherto anticipated. 

The recent study’ of HFLS3, a ‘maximum starburst’ (that is, a galaxy 
that converts gas into stars at a rates that are close to the theoretical 
limit) at z = 6.3, provides an example of a galaxy with a large amount 
of dust (about a billion solar masses, Mg), and a ratio of dust mass to 
gas mass of 0.01 and a ratio of dust mass to star mass of 0.04, which are 
more like those of nearby starbursting galaxies. This galaxy has an 
astonishing star-formation rate of around 3,000 Mo per year, and con- 
verts its gas into stars at rates 2,000 times that of typical galaxies, which 
are properties rare even for gas-rich high-z galaxies. Frequently, obser- 
vations of dust and molecules in high-z galaxies tend to target bright 
active galaxies*®, such as HFLS3 or J114816.64+525150.3. Such massive 
galaxies are well known to be rare at all redshifts. For those ‘normal’ 
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filters is contained within a small region (15” or 1.3 kpc). We note that the 
off-target peaks in the 160-j1m map are not noise: they are all coincident with 
peaks in the 100-j1m map, and are therefore most probably background targets. 
At 160 tum we detect emission at the 3a level that is consistent with the 

peak of both H1 and the infrared emission at 24 |m, 70 um and 100 um, and 
which we attribute to I Zw 18. The peak and extent of the emission in our 
images is coincident with that of the Ha emission from Hubble Space 
Telescope images’, and also the peak of H1 emission’. The white ovals 
represent the shape and size of the beam of the infrared map. 
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galaxies where deep submillimetre observations have been performed, 
at present only upper limits for both [Cu] and the submillimetre 
continuum exist’*. These observational limits suggest that for the first 
800 million years of galaxy evolution, galaxies with very little dust and 
low metallicity, like Himiko, are more typical. Results from stellar- 
population analyses do indeed show that high-z galaxies have very 
little evidence of dust extinction’. An understanding of the physical 
conditions under which stars form in these primitive systems, however, 
can come only from the study of local analogues. 

Located at a distance of 18 Mpc (ref. 10), I Zw 18 is the archetypal 
star-forming, very low-metallicity galaxy (12 + log(O/H) = 7.17, or 
1/30 the solar metallicity). I Zw 18 is gas-rich (atomic hydrogen mass 
of 2.3 × 10⁸ M⊙ and molecular hydrogen mass of less than 5 × 10’ M⊙) 
and actively star-forming at a rate of (0.05 ± 0.02) M⊙ per year. 
Given its stellar mass (the combined mass of all the stars in the galaxy) 
of 9 × 10⁷ M⊙, this galaxy has a very high gas fraction (its gas mass is 
two-thirds of the total mass of its gas and stars). I Zw 18 is currently 
undergoing a starburst phase, but despite its active star formation no 
CO emission has been detected; such emission would indicate the 
presence of molecular gas in I Zw 18. The lowest-metallicity detection of 
CO has recently been reported in the dwarf galaxy Wolf-Lundmark-Melotte 
(WLM) (12 + log(O/H) ~ 7.8, or 1/8 the solar value). Unlike 
WLM, I Zw 18 has a very active star-forming environment, which 
probably photodissociates CO. These properties mean that it is among the 
closest analogues to primitive high-z galaxies, although I Zw 18 contains a 
larger population of evolved stars than may be found in the early Universe. 
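
The quantities in this paragraph can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis. It takes the stellar mass as 9 × 10⁷ M⊙ and the atomic hydrogen mass as 2.3 × 10⁸ M⊙ (the exponents are assumed here), ignores helium and molecular gas, and assumes a solar oxygen abundance of 12 + log(O/H) = 8.69, a value that does not appear in the text. The function names are illustrative only.

```python
# Values quoted in the text for I Zw 18 (exponents are assumptions where unclear).
STELLAR_MASS_MSUN = 9e7        # combined mass of all stars
HI_MASS_MSUN = 2.3e8           # atomic hydrogen mass
SFR_MSUN_PER_YR = 0.05         # star-formation rate
OXYGEN_ABUNDANCE = 7.17        # 12 + log(O/H)

# Assumption, not from the text: solar value of 12 + log(O/H).
SOLAR_OXYGEN_ABUNDANCE = 8.69


def assembly_time_yr(stellar_mass, sfr):
    """Time needed to build the stellar mass at the current star-formation rate."""
    return stellar_mass / sfr


def gas_fraction(gas_mass, stellar_mass):
    """Gas mass divided by the total mass of gas and stars."""
    return gas_mass / (gas_mass + stellar_mass)


def metallicity_relative_to_solar(abundance, solar=SOLAR_OXYGEN_ABUNDANCE):
    """Convert 12 + log(O/H) into a fraction of the solar value."""
    return 10.0 ** (abundance - solar)


t_assembly = assembly_time_yr(STELLAR_MASS_MSUN, SFR_MSUN_PER_YR)
f_gas = gas_fraction(HI_MASS_MSUN, STELLAR_MASS_MSUN)
z_rel = metallicity_relative_to_solar(OXYGEN_ABUNDANCE)

print(f"assembly time ~ {t_assembly:.1e} yr")
print(f"gas fraction (H I only) ~ {f_gas:.2f}")
print(f"metallicity ~ 1/{1.0 / z_rel:.0f} solar")
```

Under these assumptions the results land close to the quoted values: an assembly time of order 10⁹ years, a gas fraction near two-thirds and a metallicity of roughly 1/30 solar.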

We note that the difference in stellar mass between I Zw 18 (or any 
local low-metallicity galaxy) and observed high-z galaxies is large. 
Because of the smaller potential well, identical starbursts can in 
principle more easily drive dust and metal-enriched gas out of I Zw 18 than 
out of Himiko, for example. Nonetheless, I Zw 18 and galaxies like it 
remain our best candidates for the study of metal-poor, starbursting 
environments. 

Figure 2 | The far-infrared spectral energy distribution of I Zw 18. Blue 
squares represent fluxes measured by the Infrared Array Camera (IRAC) and 
the Multiband Imaging Photometer (MIPS) onboard the Spitzer Space 
Telescope. The blue line is a smoothed Spitzer Infrared Spectrograph 
spectrum. The open triangles are upper limits from the Herschel Spectral and 
Photometric Imaging Receiver (SPIRE) instrument. The red circles show our 
new Herschel PACS data at 100 μm and 160 μm. Error bars represent 1σ 
uncertainties. We fitted the spectral energy distribution with models based on a 
mixture of carbonaceous and silicate dust grains, which assume a distribution 
of grain sizes matching that of the Milky Way. This is a commonly adopted dust 
model, allowing direct comparison to dust masses in the literature, and is not 
expected to introduce large errors in the dust mass estimate. Most of the 
modelled dust is heated by a single starlight intensity Umin, but a fraction is 
heated by a power-law distribution of intensities, with an adjustable slope α and 
upper cut-off of Umax. The best-fitting dust model to the spectral energy 
distribution of I Zw 18, shown here, returns the following values: α = 2.4, 
Umin = 100, and ⟨U⟩ ~ 200, for Umax = 10’ and no polycyclic aromatic 
hydrocarbons, with U in units of the radiation field in the vicinity of the Sun. 
The solid grey line represents the best-fitting dust model. The dotted line 
represents a cold (Tdust = 20 K) dust model with a mass of 1,000 M⊙ and a κ 
value that is proportional to ν’”. The dashed line represents the linear 
combination of the cold component and the dust model. We find that including 
the cold component of dust increases the discrepancy between the model and 
the 160-μm flux to 2.3σ. Our assumption of a factor-of-two uncertainty in the 
dust mass therefore accounts for the possibility of the cold component. 

Using the Photodetector Array Camera and Spectrometer (PACS) 
on the Herschel Space Observatory, we measured the flux of I Zw 18 to 
be 21.1 mJy at wavelength λ = 100 μm, with an uncertainty in flux 
(calculated by placing apertures randomly in the map) of ±2.9 mJy 
(signal-to-noise ratio, S/N ~ 7) and a calibration uncertainty of 10% 
(±2.1 mJy). Recent results find a 100-μm flux that is consistent with 
our result. At 160 μm we measured a flux of 5.6 mJy, with a flux 
uncertainty in the map of ±1.3 mJy (S/N ~ 4) and a calibration uncertainty 
of ±0.6 mJy. (Maps are shown in Fig. 1 and our procedure is discussed 
in the Supplementary Information.) Together with these detections, 
we used a number of ancillary data sources to construct a full infrared 
spectral energy distribution for modelling the dust and star formation. 
We use data from the Spitzer Space Telescope covering 3.6 μm, 4.5 μm, 
5.8 μm, 8.0 μm, 24 μm and 70 μm, as well as a spectrum from the Spitzer 
Infrared Spectrograph (see Supplementary Information). 
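
The quoted signal-to-noise ratios follow directly from the fluxes and map uncertainties given above. The short Python sketch below reproduces them. It also combines the map and calibration uncertainties in quadrature; that step assumes the two error terms are independent, which the text does not state. The dictionary and function names are illustrative.

```python
import math

# PACS photometry quoted in the text, in mJy:
# wavelength (um) -> (flux, map uncertainty, calibration uncertainty)
PACS_PHOTOMETRY_MJY = {
    100: (21.1, 2.9, 2.1),
    160: (5.6, 1.3, 0.6),
}


def signal_to_noise(flux, map_uncertainty):
    """Detection significance from the map (aperture) uncertainty alone."""
    return flux / map_uncertainty


def combined_uncertainty(map_uncertainty, calibration_uncertainty):
    """Assumption: the two error terms are independent and add in quadrature."""
    return math.hypot(map_uncertainty, calibration_uncertainty)


for wavelength_um, (flux, map_err, cal_err) in PACS_PHOTOMETRY_MJY.items():
    snr = signal_to_noise(flux, map_err)
    total = combined_uncertainty(map_err, cal_err)
    print(f"{wavelength_um} um: S/N ~ {snr:.1f}, combined error ~ {total:.1f} mJy")
```

The calibration term at 100 μm is simply 10% of 21.1 mJy, which rounds to the ±2.1 mJy quoted.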

The dust mass we determined from models (shown in Fig. 2) with 
mixed dust grain temperatures is about 912 M⊙, within a range of 
450–1,800 M⊙ (see Supplementary Information for a complete discussion 
of the uncertainties). Modified blackbody models are also commonly used 
for fitting dust spectral energy distributions, although they yield 
unrealistically low dust masses because of the assumption of a single 
temperature. For comparison, a modified blackbody model with 
Fν ∝ ν'°Bν(Tdust) (where Fν is the flux, ν is the frequency and B is the 
brightness of a blackbody function representing dust of temperature 
Tdust) and mass emissivity κoo9 = 6.37 cm² g⁻¹ reproduces the flux at 
70 μm and 160 μm for Tdust = 70 K and a dust mass of about 250 M⊙. We 
find that a significant mass of cold dust with Tdust > 15 K would be 
detected by our 160-μm flux (see Supplementary Information). 
Independently of the assumed model, the dust mass necessary to explain the 
spectral energy distribution observed in I Zw 18 is extremely low. 

Figure 3 | The dust-to-gas ratio of galaxies compared to metallicity for local 
galaxies and I Zw 18. The local galaxy sample consists of dust-to-gas ratios 
from two recent papers; the sample is representative of typical disk and 
dwarf galaxies in the local Universe. For some galaxies in the local sample, flux 
measurements that allow for the gas-mass determination of either H I or 
H₂ are not available in the literature. In this case we estimate the total gas 
mass with an empirical correlation relating molecular to atomic gas: 
M(H₂) = 0.008M(H I)!*. Those galaxies are plotted as open circles. We also 
include two nearby, well known, low-metallicity galaxies: WLM (purple 
circle) and the Small Magellanic Cloud (blue circle). The error bars in the 
upper left corner represent the median 2σ for the local galaxy sample. A 
common assumption is that the dust-to-gas ratio in galaxies scales linearly with 
metallicity. We therefore show a linear bisecting line set to match the local 
galaxy sample; the dashed lines represent the ±2σ root mean squared scatter 
around this bisector. I Zw 18 is a clear outlier from this correlation. (The open 
red diamond symbol represents the local dust-to-gas ratio.) Both the Small 
Magellanic Cloud and WLM have a linear relationship between dust-to-gas 
ratio and metallicity; however, we note that, unlike those two galaxies, I Zw 18 is 
a starbursting galaxy. If I Zw 18 is representative of starbursting low-metallicity 
environments, this implies that the dust mass is lower than we may expect in 
primitive galaxies of the early Universe. 

Figure 4 | Dust mass versus star-formation rate and stellar mass for local 
disks, high-z starbursts and I Zw 18. Here we compare dust masses to 
star-formation rates (a) and stellar masses (b) of a sample of normal and 
starbursting galaxies, including I Zw 18. The definitions of symbols are the 
same in both panels. The red diamond represents I Zw 18. Squares represent 
1 < z < 3 starbursts. The blue triangle represents the present upper limit on 
the dust mass of Himiko; the blue diamond shows the dust mass of Himiko if it 
has a similar dust-to-stellar mass ratio to I Zw 18. Solid lines represent linear 
relationships matching the local galaxies; the dashed lines represent 1/100 
(short dashes) and 1/1,000 (long dashes) of the local galaxy sample. The error 
bars in both panels represent the median 2σ error bar for the sample. We note 

Under simplistic model assumptions, decreasing the amount of 
heavy elements, of which the dust grains are formed, results in a pro- 
portional decrease in the dust-to-gas ratio. This scenario is frequently 
assumed in cosmological models of star formation. Observational 
constraints for this relationship in the early Universe are scarce. I Zw 18 
provides a probe of such environments, and our measurements directly 
constrain the relationship between dust-to-gas ratio and metallicity at very 
low metallicities (12 + log(O/H) = 8). We find that I Zw 18 falls 
roughly two orders of magnitude below the linear correlation between 
metallicity and dust-to-gas ratio (Fig. 3). The distance to the linear relation 
is very significant, approximately four times larger than the spread in the 
data, and much larger than the error bars on the measurement. Using 
dust-to-gas ratios measured only in the infrared-emitting region (the 
central 15 arcsec) typically results in a linear relationship even at low 
metallicity, 12 + log(O/H) ~ 8, but not in the case of I Zw 18, for which 
the local (that is, measured only in the infrared-emitting region) dust-to- 
gas mass ratio is still a factor of 38 below the linear relationship. 
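
The size of the offset from a linear scaling can be illustrated with simple arithmetic. The Python sketch below is only a rough check under stated assumptions. It normalises the linear relation to the local dust-to-gas ratio of about one per cent at solar metallicity, whereas the bisector fitted in Fig. 3 may be normalised differently. It also takes the measured dust-to-gas range of I Zw 18 as 3.2–13 × 10⁻⁶ (exponent assumed). The names are illustrative.

```python
# From the text: local dust-to-gas mass ratio of "around one per cent".
LOCAL_DUST_TO_GAS = 0.01
# From the text: I Zw 18 metallicity of about 1/30 solar.
IZW18_METALLICITY_REL_SOLAR = 1.0 / 30.0
# From the text: measured dust-to-gas mass ratio range (exponent assumed to be -6).
IZW18_DUST_TO_GAS_RANGE = (3.2e-6, 13e-6)


def linear_dust_to_gas(metallicity_rel_solar, local_ratio=LOCAL_DUST_TO_GAS):
    """Dust-to-gas ratio if it scaled in direct proportion to metallicity."""
    return local_ratio * metallicity_rel_solar


expected = linear_dust_to_gas(IZW18_METALLICITY_REL_SOLAR)
measured_low, measured_high = IZW18_DUST_TO_GAS_RANGE

# How far below the linear expectation the measured range lies.
deficit_min = expected / measured_high
deficit_max = expected / measured_low

print(f"linear expectation ~ {expected:.1e}")
print(f"measured ratio is lower by a factor of ~{deficit_min:.0f} to ~{deficit_max:.0f}")
```

With this simple normalisation the shortfall is a factor of a few tens to about a hundred, the same order as the offset described in the text.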

I Zw 18 stands out in the local galaxy population because it has an 
environment that is both starbursting and also lacking heavy elements. 
Our results suggest that in starbursting galaxies with very low 
metallicities the dust-to-gas ratio is determined by more than just the 
availability of heavy elements. The essentially dust-free character of nascent 
galaxies like Himiko therefore probably reflects a combination of 
low metallicity and the balance of the dust production-and-destruction 
mechanisms in a starburst environment, which act together to keep the 
dust-to-gas ratio very low in these galaxies. 

These are properties that are common at very high redshift (z > 6), 
and consequently we expect those primitive galaxies to exhibit very low 
dust masses compared to their star-formation rates and stellar masses. 
The ratio of dust mass to star-formation rate (Fig. 4) in I Zw 18 is more 
than two orders of magnitude lower than typical in local galaxies, and 
an order of magnitude lower than observed in the z = 2–3 starbursts. 
Even when normalized by its stellar mass or star-formation rate, the 
dust mass of I Zw 18 is extremely small compared with both local and 
z = 2–3 galaxies. In Fig. 4 we show that if one assumes that Himiko has 
the same dust-to-gas ratio as I Zw 18, rather than using the upper limit 
on the 1.2-mm flux, this would significantly affect the amount of dust 
relative to the gas mass and star mass in high-z galaxies. With 
considerable uncertainty, we can scale the stellar mass of Himiko with the 
dust-to-stellar-mass ratio of I Zw 18, and place it on Fig. 4 with a dust 
mass of about 50,000 M⊙. 

that very significant differences exist between I Zw 18 and the other 
starbursting systems shown here; the distant starbursts have star-formation 
rates five orders of magnitude higher than I Zw 18, and many of them have solar 
metallicity. I Zw 18 is clearly an extreme outlier towards lower ratios of dust to 
star-formation rate and lower ratios of dust to stellar mass when compared 
with typical nearby galaxies, and the dust mass is even lower per unit of 
star-formation rate than for starbursting galaxies. As indicated by the extreme 
difference in the current upper limit of Himiko, and its predicted dust mass 
using the ratio of dust mass to star mass of I Zw 18, observations of the highest-z 
galaxies may be significantly overestimating the dust mass. 

If the dust temperature is 40 K (ref. 3), we calculate that Himiko 
would have a flux density of 0.5 μJy at Earth at 260 GHz, which would 
require several tens of days of integration with the complete Atacama 
Large Millimeter Array (ALMA) to detect. Maximum starbursts like 
HFLS3 are very rare, even in the early Universe (about one per cubic 
gigaparsec), whereas blue dust-poor ‘drop-out’ galaxies (so highly 
redshifted that they can only be seen in the red filters) are much more 
common at high redshifts (one in a thousand per cubic megaparsec). 
This implies that the interstellar medium of I Zw 18 may indeed be 
representative of the primitive galaxy population in the early Universe. 
If this is the case, the prospects for detecting dust emission at z > 6 will 
probably be limited to unusually evolved sources like HFLS3 and SDSS 
J114816.64+525150.3. 
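
The two number densities quoted above are given in different volume units, so the contrast is easier to see once both are expressed per cubic gigaparsec. The Python sketch below performs only that unit conversion, using the round figures from the text. The variable names are illustrative.

```python
MPC_PER_GPC = 1000.0
MPC3_PER_GPC3 = MPC_PER_GPC ** 3

# Number densities quoted in the text.
MAX_STARBURSTS_PER_GPC3 = 1.0   # "about one per cubic gigaparsec"
DROPOUTS_PER_MPC3 = 1e-3        # "one in a thousand per cubic megaparsec"


def per_gpc3(density_per_mpc3):
    """Convert a number density from per cubic megaparsec to per cubic gigaparsec."""
    return density_per_mpc3 * MPC3_PER_GPC3


dropouts_per_gpc3 = per_gpc3(DROPOUTS_PER_MPC3)
abundance_ratio = dropouts_per_gpc3 / MAX_STARBURSTS_PER_GPC3

print(f"drop-out galaxies per cubic gigaparsec ~ {dropouts_per_gpc3:.0e}")
print(f"drop-outs outnumber maximum starbursts by ~ {abundance_ratio:.0e}")
```

Taken at face value, the quoted densities make dust-poor drop-out galaxies about a million times more common than maximum starbursts.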
Face-to-face transfer of wafer-scale graphene films 


Libo Gao, Guang-Xin Ni, Yanpeng Liu, Bo Liu, Antonio H. Castro Neto & Kian Ping Loh 


Graphene has attracted worldwide interest since its experimental 
discovery, but the preparation of large-area, continuous graphene 
film on SiO₂/Si wafers, free from growth-related morphological 
defects or transfer-induced cracks and folds, remains a formidable 
challenge. Growth of graphene by chemical vapour deposition on 
Cu foils has emerged as a powerful technique owing to its 
compatibility with industrial-scale roll-to-roll technology. However, the 
polycrystalline nature and microscopic roughness of Cu foils mean 
that such roll-to-roll transferred films are not devoid of cracks and 
folds. High-fidelity transfer or direct growth of high-quality 
graphene films on arbitrary substrates is needed to enable wide-ranging 
applications in photonics or electronics, which include devices such 
as optoelectronic modulators, transistors, on-chip biosensors and 
tunnelling barriers. The direct growth of graphene film on an 
insulating substrate, such as a SiO₂/Si wafer, would be useful for this 
purpose, but current research efforts remain grounded at the 
proof-of-concept stage, where only discontinuous, nanometre-sized islands 
can be obtained. Here we develop a face-to-face transfer method 
for wafer-scale graphene films that is so far the only known way to 
accomplish both the growth and transfer steps on one wafer. This 
spontaneous transfer method relies on nascent gas bubbles and 
capillary bridges between the graphene film and the underlying 
substrate during etching of the metal catalyst, which is analogous 
to the method used by tree frogs to remain attached to submerged 
leaves. In contrast to the previous wet or dry transfer 
results, the face-to-face transfer does not have to be done by hand 
and is compatible with any size and shape of substrate; this approach 
also enjoys the benefit of a much reduced density of transfer defects 
compared with the conventional transfer method. Most importantly, 
the direct growth and spontaneous attachment of graphene 
on the underlying substrate is amenable to batch processing in a 
semiconductor production line, and thus will speed up the 
technological application of graphene. 

Much effort has been directed to controlling the grain size, doping 
and heterostructure of graphene by fine-tuning growth conditions 
during chemical vapour deposition (CVD), but there has been a lack of 
breakthroughs in the after-growth transfer process of the graphene film. 
Conventional transfer methods for CVD graphene can be classified as 
either dry or wet transfer, depending on the environment in 
which graphene touches the target substrate. The dry transfer method 
seems to be more applicable to industrial applications, because it can 
realize a 30-inch graphene film on a flexible substrate, but plenty of 
transfer defects occur, for example cracks, folds and wrinkles. The 
wet transfer method is generally difficult to scale up, and the surface 
tension experienced by the floating graphene at the air–water interface 
causes warping, rippling and rolling of the films during transfer. 

The inspiration for our face-to-face method is found in the study of 
how the feet of a terrestrial beetle or tree frog remain attached to a fully 
submerged leaf. Microscopic observations reveal that air bubbles 
that are trapped around the feet of the beetle form capillary bridges and 
keep the beetle’s feet attached to the submerged leaf. In a similar fashion, 
the formation of capillary bridges, as indicated schematically in Fig. 1a, 
ensures that the graphene film remains attached to the substrate and 


does not undergo delamination during the etching process (Fig. 1b). 
During the etching of the copper film between the graphene and the 
SiO₂/Si substrate, dissolution of the copper generates voids and 
channels, and these create capillary forces that allow the liquid etchant to 
infiltrate between the graphene film and the substrate. In the case of 
hydrophobic graphene surfaces, the unfavourable interactions between 
the water molecules and the soft graphene film induce instability of the 
planar interface, which leads to fluctuations of the water interface and 
spontaneous cavitation. Strong negative pressure will operate in short 
capillary bridges, resulting in long-ranged attraction between the con- 
tacting surfaces. In fact, such capillary adhesion forces can be larger 
than the van der Waals interaction between the two solids. We suggest 
that the evolution of bubbles during the etching process contributes to 
the formation of capillary bridges between the graphene-substrate inter- 
faces, thus allowing the graphene film to remain attached to the sub- 
strate even with the infiltration of liquid. However, gas bubbles can 
also generate buoyancy forces that will separate the graphene film 
from the underlying substrate, and this is in fact the situation exploited 
in the conventional float transfer process. Therefore, the fate of the 
graphene film—with regard to either delamination from, or adhesion 
to, the substrate—is decided by whether a sufficient number of capil- 
lary bridges can be formed between the graphene and the substrate to 
counteract the pull-off forces due to buoyancy. 

The presence of gas bubbles under the graphene can be seen by atomic 
force microscopy (AFM) imaging of the surface after CVD growth. 
Inductively coupled plasma CVD (ICP-CVD), required for the growth 
of graphene (see Methods), produces energetic ions that can be absorbed 
by the metal catalyst at high temperature. As the temperature is reduced 
at the end of the growth process, some gas precipitates at the surface 
and gets trapped between the impermeable graphene and the 
substrate, forming bubbles. These bubbles alter the adhesion and strain 
properties of graphene, which are manifested as bright spots in the 
AFM phase contrast images (Supplementary Fig. 2). To facilitate the 
formation of capillary bridges, a pre-treatment step involving plasma 
nitridation of the SiO₂/Si wafer is helpful. Our experiments showed 
that without this treatment step, the graphene film delaminates at the 
end of the etching process. The nitrogen plasma treatment converts 
the top several nanometres of the surface to silicon oxynitride phases, 
and these decompose readily during the CVD process at higher 
temperature and act as an additional source of bubbles under the graphene. 
After the plasma nitridation of the substrate, Cu catalyst is sputtered 
onto the surface and growth of graphene is performed by ICP-CVD. 
The graphene/Cu/SiO₂/Si wafer is then coated with polymethyl 
methacrylate (PMMA) for protection and immersed in an aqueous etchant 
bath. The Cu film is then etched (Supplementary Figs 3 and 4). 
Throughout, the PMMA/graphene film adheres firmly to the underlying wafer. 
It is worth pointing out that our face-to-face transfer process has a 
shorter Cu etching time than does conventional float transfer (Fig. 2a). 
The Cu film can usually be completely dissolved within 2 h, even in 
dilute etchant. 

To estimate the thickness of the infiltrated water layer, we produced 
an 8-inch SiO₂/Si wafer onto which a graphene layer had been 
transferred by our face-to-face process; this wafer, with the layer structure 
PMMA/graphene/water layer/SiO₂/Si, was pulled out partially from water 
for optical characterization (Fig. 2b). The combined effects of reflection 
and refraction of light at the multiple interfaces can be simulated using 
the CIE (International Commission on Illumination) colour-matching 
equation to reproduce the colour change (Fig. 2c; the method is 
illustrated in Supplementary Fig. 5). The observed colour change of the 
PMMA/graphene/water layer/SiO₂/Si system is due to the change in 
thickness of the water layer under the graphene as it escapes, which 
generates coloured fringes (Fig. 2d). The PMMA/graphene floating on 
infiltrated water at the air–water border is pale red in colour (right) at 
first, and changes into olive, green and violet as the thickness of the water 
layer decreases. On the basis of the simulated colour changes, the 
thickness of the intercalated water layer is estimated to be 60–80 nm in air 
(violet) and 250–300 nm in water (pale red). To confirm this 
measurement, the thickness of the intercalated water layer when the wafer is 
immersed in water is determined by an AFM that can perform measurements 
under water, and its thickness in air after some water escapes is 
measured by a fast-scan AFM (Supplementary Fig. 6). Figure 2e shows that 
the thickness of the water layer when the wafer is under water 
has a mean value of ~250 nm, and that the thickness in air is ~70 nm. 
These values agree with the simulation in Fig. 2c, marked by circles. It 
is worth noting that the change in height profile is reversible when the 
wafer is immersed into water or withdrawn from it again; this implies 
that water can infiltrate freely in the intercalated layer between 
graphene and the substrate. The infiltrated water can be completely 
evaporated by baking the wafer at 150 °C for more than 10 min, producing 
a dry graphene film on SiO₂/Si. The face-to-face transfer also works on 
quartz substrates, as shown in Fig. 2f–i for a 700-nm sputtered Cu film 
on a 14 cm × 14 cm quartz plate. 

Figure 1 | Illustration of our face-to-face method for transferring graphene 
mediated by capillary bridges. a, Schematic illustration showing ‘bubble 
seeding’ by plasma treatment, CVD growth, Cu film etching, formation of 
capillary bridges and removal of water and PMMA. See text for details. 
b, Schematic illustration showing that in the absence of plasma treatment, 
delamination of the film results. 

Figure 2 | Characterization of intercalated water layer with capillary bridges 
during face-to-face transfer. a, Comparison of etching time between 
face-to-face (F2F) and float transfer. The triangles indicate the time when 
visible holes in the copper film appear, and the squares indicate the time of 
complete etching of the Cu film. b, Photograph of a partially submerged 8-inch 
PMMA/graphene/water layer/SiO₂/Si wafer in water. The blue box shows 
the border between water/PMMA/water and air/PMMA/water interfaces. 
c, Simulated colour change of the PMMA/graphene/water layer/SiO₂/Si wafer 
placed in air. Red and green circles indicate sampling positions for data in e. 
The vertical yellow line indicates the relevant data for different thicknesses of 
the water layer and a fixed thickness of 150 nm for the PMMA layer, which 
is used here. d, Optical image of PMMA/graphene at the water–air border. 
The red dot identifies the bulk water. Scale bar, 500 μm. e, AFM profiles of 
thicknesses of PMMA/graphene with infiltrated water layer, measured in water 
(top) and in air (bottom). f–h, Photographs of PMMA/graphene/Cu (700 nm)/ 
quartz (14 cm) immersed in Cu etchant for different times (given in red at 
bottom of panels), showing that the face-to-face transfer also works on quartz. 
i, Left, photograph of face-to-face transferred graphene on quartz after removal 
of PMMA and baking. Right, quartz only. 

The Greenwood-Williamson contact mechanics theory adapted to 
randomly rough surfaces in close contact can be used to estimate the 
maximal height of the capillary bridges formed at the interface. 
PMMA/graphene can be considered an elastic soft surface in contact 
with a hard and rough substrate (Cu/SiO₂/Si). In the contact region 
between the two surfaces, a liquid capillary bridge will form. The meniscus 
radius $r_K$ is given by the Kelvin equation, 
$r_K = -\gamma v_0 / [k_B T \ln(P_v/P_{sat})]$, 
where $\gamma$ is the surface tension, $P_v$ is the actual vapour pressure, 
$P_{sat}$ is the saturated vapour pressure, $v_0$ is the molecular volume of 
water, $k_B$ is Boltzmann’s constant and $T$ is temperature, and the thickness 
of the water is given by $d_K = r_K(\cos\theta_1 + \cos\theta_2)$, where 
$\theta_1$ and $\theta_2$ are the contact angles of the two surfaces. 
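
To give a feel for the scales involved, the Kelvin relation can be evaluated numerically. The Python sketch below is an illustration rather than the calculation used in this work. It writes the relation per mole (gas constant R with the molar volume), which is equivalent to the per-molecule form with Boltzmann's constant. The water properties (surface tension 0.072 N m⁻¹, molar volume 1.8 × 10⁻⁵ m³ mol⁻¹), the temperature, the relative vapour pressures and the contact angles are all assumed values that do not come from the text, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def kelvin_radius(surface_tension, molar_volume, temperature, relative_pressure):
    """Kelvin radius r_K in metres for a vapour pressure ratio P_v/P_sat below 1."""
    if not 0.0 < relative_pressure < 1.0:
        raise ValueError("relative_pressure must lie strictly between 0 and 1")
    return -surface_tension * molar_volume / (
        R_GAS * temperature * math.log(relative_pressure)
    )


def bridge_thickness(r_k, theta1_deg, theta2_deg):
    """Water thickness d_K = r_K (cos(theta1) + cos(theta2))."""
    return r_k * (
        math.cos(math.radians(theta1_deg)) + math.cos(math.radians(theta2_deg))
    )


# Assumed illustrative inputs (not from the text).
GAMMA_WATER = 0.072          # N m^-1
MOLAR_VOLUME_WATER = 1.8e-5  # m^3 mol^-1
TEMPERATURE = 298.0          # K

for ratio in (0.90, 0.99):
    r_k = kelvin_radius(GAMMA_WATER, MOLAR_VOLUME_WATER, TEMPERATURE, ratio)
    d_k = bridge_thickness(r_k, 60.0, 30.0)  # hypothetical contact angles
    print(f"P_v/P_sat = {ratio}: r_K ~ {r_k * 1e9:.1f} nm, d_K ~ {d_k * 1e9:.1f} nm")
```

The radius grows rapidly as the vapour pressure approaches saturation, which is why bridges of appreciable height are expected only close to saturation.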

When the PMMA/graphene is almost delaminating from the wafer 
surface because of a pull-off force, $d_K$ becomes largest. At this point, the 
work of adhesion ($w$; equation (1)) will be decreased to zero, and the 
maximal height of the capillary bridge ($d_c$) can be estimated from 
equation (2): 

$$w = 2\gamma\left[1 - \frac{h}{2d_K}\,\ln\!\left(\frac{q_0 h E d_K}{2\gamma}\right)\right] \qquad (1)$$

$$d_c = \frac{h}{2}\,\ln\!\left(\frac{q_0 h E d_c}{2\gamma}\right) \qquad (2)$$


Here $q_0$ is the roll-off wavevector of the surface roughness power 
spectrum, $h$ is the roughness and $E$ is the Young’s modulus of the film. 
Accordingly, $d_c$ is calculated to be in the range 300–450 nm by 
assuming a roughness of $h$ = 100–150 nm for the PMMA/graphene (root mean 
squared roughness as determined by AFM). The significance of $d_c$ is 
that it sets an upper limit on the thickness of the Cu films that can be 
sputtered onto the SiO₂/Si surface, because a thick Cu film will result 
in the infiltration of a thick water layer that exceeds the threshold 
thickness $d_c$. As observed in our experiments, if the thickness of the Cu film 
is three times larger than $d_c$, then delamination of the PMMA/graphene 
film occurs as a result of the infiltration of an excessively thick water 
layer. Under typical conditions, the thickness of the infiltrated water 
is about one-third the thickness of the Cu film because hydrostatic 
pressure compresses the film after the removal of the Cu. Because 
PMMA/graphene is a soft membrane, it can deform elastically 
in response to the capillary forces caused by the gas bridges, and 
this allows the film to be pulled closer to the hard substrate. 
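
The rule of thumb described above (infiltrated water about one-third as thick as the Cu film, and delamination once the Cu film exceeds about three times the critical bridge height) can be written as a small check. The Python sketch below encodes only those two statements. The function names are hypothetical, and the one-third factor is the typical value quoted in the text rather than a derived quantity.

```python
WATER_TO_CU_THICKNESS_RATIO = 1.0 / 3.0  # "about one-third", as quoted in the text


def infiltrated_water_thickness_nm(cu_thickness_nm):
    """Typical thickness of the water layer left after etching a Cu film."""
    return cu_thickness_nm * WATER_TO_CU_THICKNESS_RATIO


def is_expected_to_delaminate(cu_thickness_nm, critical_height_nm):
    """True if the Cu film is more than three times the critical bridge height."""
    return cu_thickness_nm > 3.0 * critical_height_nm


# Example: the 700-nm Cu film used on quartz, against the quoted 300-450 nm range.
cu_nm = 700.0
water_nm = infiltrated_water_thickness_nm(cu_nm)
print(f"expected water layer ~ {water_nm:.0f} nm")
for critical_nm in (300.0, 450.0):
    verdict = is_expected_to_delaminate(cu_nm, critical_nm)
    print(f"critical height {critical_nm:.0f} nm -> delaminates: {verdict}")
```

For a 700-nm Cu film the expected water layer stays below the quoted 300–450 nm range for the critical height, consistent with the successful transfer onto quartz reported in Fig. 2f–i.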

The high surface tension of water exerts a pulling force on the gra- 
phene, and undesirable ripples and folds are created. The surface ten- 
sion can be reduced by adding isopropyl alcohol or by increasing the 
temperature of the water (to 80 °C). Figure 3a is a typical AFM image of 
a face-to-face graphene film on a SiO2/Si wafer, where a high density of
graphene nanobubbles with heights up to ~20 nm can be seen. After 
the addition of isopropyl alcohol (Fig. 3b) or increasing water tempera- 
ture (Fig. 3c), there is a significant reduction in the density of graphene 
wrinkles, and the height of the nanobubbles is reduced by a factor of at 
least ten, resulting in a visible flattening of the film. Therefore, the addi- 
tion of a surfactant into the water can help to ‘iron out’ the creases on 
the graphene during the face-to-face transfer process. 

Figure 4a shows a photograph of face-to-face transferred graphene
on SiO2/Si wafers after removing PMMA (both a 4-inch and an 8-inch
wafer are shown). The transferred graphene film appears to be highly
uniform, with no visible transfer defects such as cracks and folds, and
its torn edges can be observed by the naked eye (Fig. 4a inset). Raman
spectroscopy is a commonly applied technique for characterizing the
properties of graphene in terms of its layer number, defects and doping.
The Raman spectra of graphene transferred by the face-to-face method and
by conventional float transfer are shown in Fig. 4b, displayed together
with the spectrum of graphene on a Cu film as a reference. Although
as-grown graphene before transfer shows nearly no defect-related D
band (at ~1,350 cm⁻¹), both face-to-face and float-transferred graphene
show small D peaks due to the Raman signal enhancement on a
dielectric substrate. The 2D band (at ~2,690 cm⁻¹), which is due to
two phonons with opposite momenta in the highest optical branch, is
Figure 3 | AFM images and height profiles of graphene on a SiO2/Si
wafer transferred by our face-to-face technique. a-c, AFM images of films
transferred in room-temperature (RT) water (a); in room-temperature
isopropyl alcohol (IPA; b); and in water at 80 °C (c). d-f, Corresponding height
profiles for a-c, respectively. The red horizontal line in a-c shows the related
profile positions for d-f. The density of the nanobubbles decreases after
reducing the liquid surface tension in b and c. Scale bars, 2 µm.
Figure 4 | Characterization of face-to-face transferred graphene on a
SiO2/Si wafer. a, Photograph of face-to-face transferred graphene on an 8-inch
wafer and a 4-inch wafer; the boxed area is shown enlarged in the inset, where
a torn edge shows the presence of a graphene sheet. b, Raman spectra of
graphene prepared by face-to-face transfer (red) and by float transfer onto
SiO2/Si substrates (blue), and of graphene on Cu film before transfer (green).
a.u., arbitrary units. c, Photograph of unbroken 1-m-long and 15-µm-wide
zigzag graphene ribbon on face-to-face transferred graphene wafer. Insets at
top and bottom right are magnified optical images from the positions marked
by yellow boxes. ‘S’ is a location marker for the metal electrode on the ribbons.
d, Optical image of a graphene ribbon 5 cm long and 10 µm wide. The graphene
ribbon is fabricated in a zigzag configuration (top panel, type I) and as a


more sensitive to long-range order. Here it can be seen that the intensity
of the 2D band is much higher in face-to-face than float-transferred
graphene, which attests to the higher crystalline quality of the former.
More characterization of the as-grown graphene is shown in Supplementary
Figs 8 and 9. X-ray photoelectron spectroscopy is used to check
the phase purity of face-to-face transferred graphene film on a SiO2/Si
wafer (Supplementary Fig. 11). There is no detectable Cu on the surface
within the sensitivity of the technique. Residual N can be detected
on the wafer (at 398 eV) but this is distinct from N dopants in graphene
and is attributed to implanted N species in the silicon oxynitride phase.

The electrical properties of face-to-face transferred graphene films
are investigated by standard micro/nanofabrication processes. To test
the robustness of this transfer process with regard to crack-free and
continuous films, long graphene ribbons were fabricated (up to 1 m
long and 15 µm wide in Fig. 4c; 5 cm long and 10 µm wide in Fig. 4d;
details in Supplementary Information). The resistivity is measured by a
standard four-probe station after applying a 50 nm Ni layer as metal
electrodes. As shown in Fig. 4e, the measured resistances scale linearly
with the length of the ribbon, and a conductivity of ~4,000 S cm⁻¹
can be achieved, thus attesting to the long-distance continuity of the
film after transfer. To the best of our knowledge, continuous, uninter- 
rupted graphene ribbons with length/width ratios of more than 10° are 
seldom demonstrated, because cracks are inevitable in the wet or dry 
transfer processes; thus, the face-to-face transfer method developed here 
is unique in enabling high-fidelity, crack-free transfer. To evaluate the 
electronic quality of the face-to-face transferred graphene grown by 
ICP-CVD at 750°C, standard electron-beam lithography is used to 
fabricate graphene Hall bars on a Si wafer with a 290-nm SiO2 layer. A




straight ribbon (bottom panel, type II); inset at top right is the zoomed optical
image for type I, from the position marked by the yellow box. e, Box plot
showing resistance versus length for graphene ribbons of type I and type II.
The linear relationship between resistance and length shows the long-distance
continuity of the ribbons. All the measurements were performed at a voltage
(V_DS) of 0.2 V between two electrodes. f, Carrier mobility for a monolayer
graphene Hall bar device under ambient conditions; the Hall effect mobility of
this device is ~3,800 cm² V⁻¹ s⁻¹. Inset, optical image of the related graphene
Hall bar device. Vig is the gate voltage during the measurement. Scale bars: 

d (main panel), 2.5 mm; in insets of c, d and f, 1 mm (top right), 20 µm (bottom
right), 20 µm and 5 µm, respectively.


typical optical image of these graphene Hall bars is shown in Fig. 4f, inset.
Four metal electrodes of Cr/Au (5/50 nm) are used to eliminate the
contact resistance, and the transport characteristics of the device are
measured under ambient conditions. The transport result for this device
is shown in Fig. 4f, and the extracted carrier mobilities of electrons and
holes for this device are both ~3,800 cm² V⁻¹ s⁻¹, which is comparable
to the properties of thermal CVD graphene grown at much higher
temperature. The slightly negative Dirac point indicates that the
transferred graphene is weakly n-type doped.
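
To put the reported ribbon conductivity in more familiar terms, the following sketch converts a bulk conductivity of ~4,000 S cm⁻¹ into a sheet resistance and an end-to-end resistance for the ribbon geometries mentioned in the text. The monolayer thickness of 0.335 nm is an assumption that does not appear in the text, and the function names are hypothetical; the numbers should be read as order-of-magnitude estimates only.

```python
# Rough conversion from bulk conductivity to sheet and ribbon resistance.
# Assumption (not from the text): an effective graphene thickness of 0.335 nm.
GRAPHENE_THICKNESS_M = 0.335e-9


def sheet_resistance_ohm_sq(conductivity_s_per_cm: float,
                            thickness_m: float = GRAPHENE_THICKNESS_M) -> float:
    """Sheet resistance (ohm per square) = 1 / (conductivity * thickness)."""
    conductivity_s_per_m = conductivity_s_per_cm * 100.0
    return 1.0 / (conductivity_s_per_m * thickness_m)


def aspect_ratio(length_m: float, width_m: float) -> float:
    """Number of squares along the ribbon (length divided by width)."""
    return length_m / width_m


def ribbon_resistance_ohm(length_m: float, width_m: float,
                          conductivity_s_per_cm: float) -> float:
    """End-to-end resistance of a uniform ribbon."""
    return sheet_resistance_ohm_sq(conductivity_s_per_cm) * aspect_ratio(length_m, width_m)


# Ribbon geometries quoted in the text: 1 m x 15 um and 5 cm x 10 um.
for length_m, width_m in ((1.0, 15e-6), (0.05, 10e-6)):
    squares = aspect_ratio(length_m, width_m)
    r_total = ribbon_resistance_ohm(length_m, width_m, 4000.0)
    print(f"L/W = {squares:.3g} squares, R = {r_total:.3g} ohm")
```

For a uniform film the resistance is proportional to the number of squares, which is why a linear resistance-versus-length plot is taken as evidence of continuity.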

All the characterization above shows that the face-to-face transferred 
films maintain good crystalline integrity and long-distance continuity 
without cracks. The copper catalyst can be effectively removed and the 
carrier mobility of the film is comparable to that of thermal CVD- 
grown film prepared at higher temperatures. The key advantages of 
this face-to-face transfer are that it is relatively easy and requires only a 
simple pretreatment step followed by immersion in a suitable etchant; 
it resembles a spontaneous transfer process because there is no recovery 
of floating graphene needed; and, most importantly, the non-manual and 
wafer-compatible nature of the method suggests that it is automation- 
compatible and industrially scalable. Interestingly, we have found that 
water has the ability to infiltrate between the graphene and the wafer 
substrate, thus allowing the addition of different surfactants for modi- 
fying the interfacial tension and reducing the corrugations in graphene 
film. Although there are many potential applications of the roll-to-roll
transfer method owing to its applicability to flexible devices, it must be
noted that so far most devices operate on ‘stiff’ substrates such as silicon,
and that a non-manual, batch-processed transfer method serving
this technology segment is definitely needed. The face-to-face transfer
method will be very useful as an enabler for rapidly emerging
graphene-on-silicon platforms that have shown excellent promise for devices
such as a gate-controlled Schottky barrier triode device and an optical
modulator. Finally, the face-to-face transfer method should be applicable
to all CVD growth on a metal-catalyst-coated wafer, such as hexagonal
BN and transition-metal chalcogenide films.


METHODS SUMMARY 


The catalyst preparation and ICP-CVD growth are both performed in a customized
sputter/electron-beam/ICP-CVD cluster. First, 8-inch or 4-inch SiO2/Si (or
quartz) wafers are sputtered with Cu films at 100-200 °C. For face-to-face transfer,
the wafers are pre-treated with N2 plasma (1,000 W, 50 mtorr). The conditions for
ICP-CVD growth are as follows: the wafer with a sputtered Cu film is treated with a
H2 plasma (150 W) for 5 min with the substrate heated to 750 °C at 50 mtorr. A
mixture of H2 and CH4 (H2:CH4 = 150:10) is introduced into the chamber for
graphene growth (150 W plasma power, 50 mtorr, 5 min). After growth, the graphene/
Cu/wafer is spin-coated with PMMA (996,000 relative molecular mass, 4 wt% in
ethyl lactate, 3,000 r.p.m. for 1 min) for protection. A 0.1 M ammonium persulphate
((NH4)2S2O8) aqueous solution or 1 M iron chloride (FeCl3) is used as the
etchant. Baking at 150 °C for 10 min is needed to evaporate the water layer. Finally,
face-to-face transferred graphene on the wafer is realized by removing the PMMA
with acetone.

Raman spectra of graphene films are collected using a 532-nm laser under ambient 
conditions (WITec alpha 300 R). The AFM measurements were performed in 
tapping mode (Bruker Dimension FastScan), and a liquid AFM was used to mea- 
sure the thickness of PMMA/graphene in water (Agilent 5420). All coloured optical 
images are captured with a Nikon Eclipse LV100D. Long graphene ribbons are 
fabricated by a laser writer (MicroTech LW405B), and graphene Hall bars are 
patterned by electron beam lithography (Nova NanoSEM 230). Electron-beam- 
evaporated 50-nm Ni films are used as contacts for the long ribbons, and thermally 
evaporated 5-nm Cr and 50-nm Au films are used as metal contacts for the Hall bars. 
Current-voltage curves are measured by a four-probe station under ambient con- 
ditions (Keithley 4200 SCS and 6430). 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
A metal-free organic–inorganic aqueous flow battery

As the fraction of electricity generation from intermittent renewable
sources, such as solar or wind, grows, the ability to store large
amounts of electrical energy is of increasing importance. Solid-electrode
batteries maintain discharge at peak power for far too short a time to
fully regulate wind or solar power output. In contrast, flow batteries
can independently scale the power (electrode area) and energy (arbitrarily
large storage volume) components of the system by maintaining
all of the electro-active species in fluid form. Wide-scale utilization
of flow batteries is, however, limited by the abundance and cost
of these materials, particularly those using redox-active metals and
precious-metal electrocatalysts. Here we describe a class of energy
storage materials that exploits the favourable chemical and electrochemical
properties of a family of molecules known as quinones.
The example we demonstrate is a metal-free flow battery based on the
redox chemistry of 9,10-anthraquinone-2,7-disulphonic acid (AQDS).
AQDS undergoes extremely rapid and reversible two-electron two-proton
reduction on a glassy carbon electrode in sulphuric acid. An
aqueous flow battery with inexpensive carbon electrodes, combining
the quinone/hydroquinone couple with the Br2/Br⁻ redox couple,
yields a peak galvanic power density exceeding 0.6 W cm⁻² at
1.3 A cm⁻². Cycling of this quinone-bromide flow battery showed
>99 per cent storage capacity retention per cycle. The organic anthraquinone
species can be synthesized from inexpensive commodity
chemicals. This organic approach permits tuning of important properties
such as the reduction potential and solubility by adding functional
groups: for example, we demonstrate that the addition of two
hydroxy groups to AQDS increases the open circuit potential of the
cell by 11% and we describe a pathway for further increases in cell
voltage. The use of π-aromatic redox-active organic molecules instead
of redox-active metals represents a new and promising direction for
realizing massive electrical energy storage at greatly reduced cost.

Solutions of AQDS in sulphuric acid (negative side) and Br2 in HBr
(positive side) were pumped through a flow cell as shown schematically
in Fig. 1a. The quinone-bromide flow battery (QBFB) was constructed
using a Nafion 212 membrane sandwiched between Toray carbon paper
electrodes (six stacked on each side) with no catalysts; it is similar to a
cell described elsewhere (see figure 2 in ref. 7). We report the potential-current
response (Fig. 1b) and the potential-power relationship (Fig. 1c
and d) for various states of charge (SOCs; measured with respect to
the quinone side of the cell). As the SOC increased from 10% to 90%,
the open-circuit potential increased linearly from 0.69 V to 0.92 V. In
the galvanic direction, peak power densities were 0.246 W cm⁻² and
0.600 W cm⁻² at these same SOCs, respectively (Fig. 1c). To avoid
significant water splitting in the electrolytic direction, we used a cut-off
voltage of 1.5 V, at which point the current densities observed
at 10% and 90% SOCs were −2.25 A cm⁻² and −0.95 A cm⁻², respectively,
with corresponding power densities of −3.342 W cm⁻² and
−1.414 W cm⁻².
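
The relationships between state of charge, open-circuit potential and power density quoted above can be checked with a few lines of arithmetic. The sketch below uses the linear best-fit line given in the Fig. 1 caption (potential = 0.00268 × SOC + 0.670, with SOC in per cent) and the definition of areal power density as voltage times current density. The function names are hypothetical, and treating the cut-off voltage as exactly 1.5 V is a simplifying assumption.

```python
# Sketch only: the linear fit is the one quoted in the Fig. 1 caption.


def open_circuit_potential_v(soc_percent: float) -> float:
    """Open-circuit potential (V) from the reported best-fit line."""
    return 0.00268 * soc_percent + 0.670


def power_density_w_cm2(voltage_v: float, current_density_a_cm2: float) -> float:
    """Areal power density (W cm^-2) as voltage times current density."""
    return voltage_v * current_density_a_cm2


def implied_voltage_v(power_w_cm2: float, current_density_a_cm2: float) -> float:
    """Cell voltage implied by a reported (power, current density) pair."""
    return power_w_cm2 / current_density_a_cm2


for soc in (10, 90):
    print(f"SOC {soc}%: fitted open-circuit potential = "
          f"{open_circuit_potential_v(soc):.3f} V")

# Electrolytic data reported at the 1.5 V cut-off (10% and 90% SOC).
for j, p in ((-2.25, -3.342), (-0.95, -1.414)):
    print(f"j = {j} A/cm2: 1.5 V * j = {power_density_w_cm2(1.5, j):.3f} W/cm2, "
          f"voltage implied by reported power = {implied_voltage_v(p, j):.3f} V")
```

The voltages implied by the reported power densities come out slightly below 1.5 V, so the reported powers appear to have been taken at the last measured point just under the cut-off rather than at exactly 1.5 V; this reading is an inference, not something the text states.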

In Fig. 2 we report the results of initial cycling studies for this battery,
to test for consistent performance over longer timescales. Figure 2a
shows cycling data at ±0.2 A cm⁻² using 50% of the total capacity of
the battery. The cycles are highly reproducible and indicate that current
efficiencies for the battery are around 95%. Figure 2b shows constant-current
cycling data, collected at 0.5 A cm⁻², using voltage cut-offs of
0 V and 1.5 V. These tests were done using the identical solutions used
in the battery for Fig. 1b–d. The galvanic discharge capacity retention
(that is, the number of coulombs extracted in one cycle divided by the
number of coulombs extracted in the previous cycle) is above 99%,
indicating the battery is capable of operating with minimal capacity
fade and suggesting that current efficiencies are actually closer to 99%.
Full characterization of the current efficiency will require slower cycling
experiments and chemical characterization of the electrolyte solutions
after extended cycling.
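
The two figures of merit used in this paragraph can be written down compactly. The sketch below implements the capacity retention as defined in the text (coulombs extracted in one cycle divided by coulombs extracted in the previous cycle) and the current efficiency as defined in the Methods (discharge time divided by the charge time of the preceding half-cycle at constant current). The function names and the example numbers are hypothetical and are not measured data.

```python
from typing import List


def capacity_retention(discharge_coulombs: List[float]) -> List[float]:
    """Per-cycle retention Q_n / Q_(n-1) for consecutive discharge capacities."""
    return [q_n / q_prev
            for q_prev, q_n in zip(discharge_coulombs, discharge_coulombs[1:])]


def remaining_fraction(per_cycle_retention: float, cycles: int) -> float:
    """Fraction of the initial capacity left after a number of cycles,
    assuming the same retention in every cycle."""
    return per_cycle_retention ** cycles


def current_efficiency(discharge_time_s: float, charge_time_s: float) -> float:
    """Constant-current efficiency: discharge time over preceding charge time."""
    return discharge_time_s / charge_time_s


# Hypothetical discharge capacities (coulombs) for three consecutive cycles.
example = [1000.0, 992.0, 984.1]
print([round(r, 4) for r in capacity_retention(example)])

# Compounding effect of a 99% per-cycle retention over 100 cycles.
print(round(remaining_fraction(0.99, 100), 3))

# Hypothetical half-cycle durations (seconds).
print(round(current_efficiency(950.0, 1000.0), 3))
```

The compounding example shows why per-cycle retention must be much closer to 100% for a practical storage system: a retention of 99% per cycle would leave only about a third of the capacity after 100 cycles.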

To gain a better understanding of the quinone half-reaction on
carbon, AQDS was subjected to half-cell electrochemical measurements.
Cyclic voltammetry of a 1 mM solution of AQDS in 1 M sulphuric
acid on a glassy carbon disk working electrode shows current peaks
corresponding to reduction and oxidation of the anthraquinone species
(Fig. 3d, solid trace). The peak separation of 34 mV is close to the value
of 59 mV/n, where n is the number of electrons involved, expected for
a two-electron process (about 30 mV for n = 2). Rotation of this disk at a
variety of rates yields mass-transport-limited currents (Fig. 3a) from which
the AQDS diffusion coefficient (D = 3.8(1) × 10⁻⁶ cm² s⁻¹) can be determined;
throughout this paper, the numbers reported in parentheses indicate the
standard deviation in the last reported digit. Koutecký–Levich analysis
at low overpotentials (Fig. 3b) can be extrapolated to infinite rotation
rate and fitted to the Butler–Volmer equation (Extended Data Fig. 3a) to
give the kinetic reduction rate constant k0 = 7.2(5) × 10⁻³ cm s⁻¹. This
rate constant is greater than that found for other species used in flow
batteries such as V³⁺/V²⁺, Br2/Br⁻ and S4²⁻/S2²⁻ (see table 2 in ref. 3).
It implies that the voltage loss due to the rate of surface electrochemical
reactions is negligible. The high rate is apparently due to an outer-sphere
two-electron reduction into the aromatic π system requiring
little reorganizational energy. The electrochemical reversibility of the
two-electron redox reaction was confirmed by fitting the slope to the
Butler–Volmer equation (Extended Data Fig. 3a), giving the transfer
coefficient α = 0.474(2), which is close to the value of 0.5 expected for
an ideally reversible reaction. The Pourbaix diagram (Extended Data
Fig. 4) confirms that a two-electron, two-proton reduction occurs in
acidic solution, and yields approximate pKa values of 7 and 11 for the
reduced AQDS species.
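
The rotating-disk analysis summarized above rests on two standard relations: the Levich equation for the mass-transport-limited current and the Koutecký–Levich extrapolation to infinite rotation rate. The sketch below illustrates both, using the reported diffusion coefficient, the 3-mm disk diameter and the 1 mM concentration given in the text and Methods. The kinematic viscosity (0.01 cm² s⁻¹) and the "true" kinetic current used to generate the synthetic data are assumptions chosen only for illustration; this is not the authors' fitting code.

```python
import math

F = 96485.0  # Faraday constant, C mol^-1


def levich_current_a(n, area_cm2, d_cm2_s, rpm, nu_cm2_s, conc_mol_cm3):
    """Levich limiting current (A): 0.620 n F A D^(2/3) w^(1/2) nu^(-1/6) C."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity, rad s^-1
    return (0.620 * n * F * area_cm2 * d_cm2_s ** (2.0 / 3.0)
            * math.sqrt(omega) * nu_cm2_s ** (-1.0 / 6.0) * conc_mol_cm3)


def kinetic_current_from_kl(rpms, currents_a):
    """Koutecky-Levich extrapolation: fit 1/i against w^(-1/2) by least
    squares; the intercept (infinite rotation rate) equals 1/i_k."""
    xs = [(2.0 * math.pi * r / 60.0) ** -0.5 for r in rpms]
    ys = [1.0 / i for i in currents_a]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return 1.0 / intercept


# Values from the text: n = 2, D = 3.8e-6 cm^2/s, 3 mm disk, 1 mM AQDS.
AREA_CM2 = math.pi * 0.15 ** 2
D_CM2_S = 3.8e-6
CONC_MOL_CM3 = 1.0e-6
NU_CM2_S = 0.01          # assumed kinematic viscosity, not from the text
I_K_TRUE_A = 50.0e-6     # assumed kinetic current for the synthetic data

rpms = [200, 300, 400, 500, 700, 900, 1200, 1600, 2000, 2500, 3600]
limiting = [levich_current_a(2, AREA_CM2, D_CM2_S, r, NU_CM2_S, CONC_MOL_CM3)
            for r in rpms]
# Synthetic mixed-control currents: 1/i = 1/i_k + 1/i_L.
measured = [1.0 / (1.0 / I_K_TRUE_A + 1.0 / i_l) for i_l in limiting]

recovered = kinetic_current_from_kl(rpms, measured)
print(f"Levich current at 3,600 r.p.m.: {limiting[-1] * 1e6:.1f} uA")
print(f"recovered kinetic current: {recovered * 1e6:.2f} uA")
```

Because the synthetic data obey the Koutecký–Levich relation exactly, the fit recovers the assumed kinetic current; with real data, repeating the extrapolation at several overpotentials gives the potential dependence that is then fitted to the Butler–Volmer equation.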

Functionalization of the anthraquinone backbone with electron-donating
groups such as hydroxy can be expected to lower the reduction
potential of AQDS (E⁰), thereby raising the cell voltage. Hydroxy-substituted
anthraquinones are synthesized through oxidation reactions
that may be performed at minimal cost. They are also natural
products that have been extracted for millennia from common sources
such as rhubarb and could even provide a renewable source for future
anthraquinone-based electrolyte solutions.

Quantum chemical calculations of unsubstituted and hydroxy-substituted
AQDS were performed to predict how substitution patterns
would change both E⁰ of the quinone/hydroquinone couples (Fig. 3c)
Figure 1 | Cell schematic and cell performance in galvanic and electrolytic
modes. a, Cell schematic. Discharge mode is shown; the arrows are reversed
for electrolytic/charge mode. AQDSH2 refers to the reduced form of AQDS.
b, Cell potential versus current density at five different states of charge
(SOCs; average of three runs); inset shows the cell open circuit potential


and the solvation free energy (G⁰_solv) in aqueous solution (Extended
Data Table 1). The addition of –OH groups is calculated to shift
E⁰ by an average of −50 mV per –OH and provide a wide window for
tuning E⁰ by almost 0.6 V. In addition, increasing numbers of hydroxy
substituents are expected to raise the aqueous solubility due to hydrogen
bonding.
Figure 2 | Cell cycling behaviour. a, Constant-current cycling at 0.2 A cm⁻²
versus SOC with best-fit line superimposed (E_eq = (0.00268 × SOC) + 0.670;
R² = 0.998). c, Galvanic power density versus current density for the same
SOCs. d, Electrolytic power density versus current density. All data here were
collected at 40 °C using a 3 M HBr + 0.5 M Br2 solution on the positive side and
a 1 M AQDS + 1 M H2SO4 solution on the negative side.


In confirmation of the theory, the experimental reduction potential
of 1,8-dihydroxy-9,10-anthraquinone-2,7-disulphonic acid (DHAQDS)
was found to be 118 mV (versus the standard hydrogen electrode),
which is very close to the 101 mV calculated for this species (Fig. 3c
and d). The experimental E⁰ of DHAQDS was 95 mV lower than that of
AQDS, and would result in an 11% increase in QBFB cell potential.
at 40 °C using a 3 M HBr + 0.5 M Br2 solution on the positive side and a
1 M AQDS + 1 M H2SO4 solution on the negative side (same solution used
in Fig. 1); discharge capacity retention is indicated for each cycle.
Figure 3 | Half-cell measurements and theory calculations. a, Rotating
disk electrode (RDE) measurements of AQDS using a glassy carbon electrode in
1 M H2SO4 at 11 rotation rates ranging from 200 r.p.m. (red) to 3,600 r.p.m.
(black). b, Koutecký–Levich plot (current^−1 versus rotation rate^−1/2) derived
from a at seven different AQDS reduction overpotentials, η. c, Calculated


DHAQDS was also found to have faster reduction kinetics (k0 = 1.56(5)
× 10⁻² cm s⁻¹), possibly due to intramolecular hydrogen bonding of
the –OH to the ketone (Extended Data Fig. 3b).
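
The 11% figure quoted for DHAQDS can be reproduced with a one-line calculation, sketched below. It assumes that the bromine-side potential is unchanged, so that a 95 mV lower quinone reduction potential raises the cell voltage by the same 95 mV, and it uses the 0.858 V cell voltage that the Methods Summary quotes for the cost calculation as the baseline. Both choices are assumptions about how the percentage was obtained, and the function names are hypothetical.

```python
def relative_voltage_gain(delta_e0_v: float, base_cell_voltage_v: float) -> float:
    """Fractional increase in cell voltage when the negative-side reduction
    potential drops by delta_e0_v and the positive side is unchanged."""
    return delta_e0_v / base_cell_voltage_v


def predicted_shift_mv(n_hydroxy: int, shift_per_oh_mv: float = -50.0) -> float:
    """Average calculated shift in E0 (mV) for a number of -OH substituents,
    using the quoted mean of about -50 mV per -OH group."""
    return n_hydroxy * shift_per_oh_mv


gain = relative_voltage_gain(0.095, 0.858)
print(f"cell-voltage gain for DHAQDS: {gain * 100:.1f}%")
print(f"average predicted shift for two -OH groups: {predicted_shift_mv(2):.0f} mV")
```

The measured 95 mV shift for two hydroxy groups is close to the roughly 100 mV expected from the average calculated value of about −50 mV per group.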

The organic approach liberates battery redox chemistry from the constraints
of the limited number of elemental redox couples of the periodic
table. Although quinones have been used previously in batteries
using redox-active solids, their incorporation into all-liquid flow
batteries offers the following advantages over current flow-battery
technologies. First, scalability: AQDS contains only the Earth-abundant atoms
carbon, sulphur, hydrogen and oxygen, and can be inexpensively manufactured
on large scales. Because some hydroxy-anthraquinones are
natural products, there is also the possibility that the electrolyte material
can be renewably sourced. Second, kinetics: quinones undergo extremely
rapid two-electron redox on simple, inexpensive carbon electrodes
and do not require a costly precious-metal catalyst. Furthermore,
this electrode permits higher charging voltages by suppressing the parasitic
water-splitting reactions. Third, stability: quinones should exhibit
minimal membrane crossover owing to their relatively large size and
charge in aqueous solution as a sulphonate anion. Furthermore, although
bromine crossover is a known issue in zinc-bromine, vanadium-bromine
and hydrogen-bromine cells, AQDS is stable to prolonged heating in
concentrated Br2/HBr mixtures (Extended Data Figs 5 and 6), and the
QBFB can be cycled in HBr electrolyte solutions (Extended Data Fig. 9).
Fourth, solubility: AQDS has an aqueous solubility greater than 1 M at
pH 0, and the quinone solution can thus be stored at relatively high energy
density: volumetric and gravimetric energy densities exceed 50 W h l⁻¹
and 50 W h kg⁻¹, respectively. Last, tunability: the reduction potential
and solubility of AQDS can be further optimized by introduction of
functional groups such as –OH. Use of DHAQDS is expected to lead
to an increase in cell potential, performance and energy density.




reduction potentials of AQDS substituted with –OH groups (black), calculated
AQDS and DHAQDS values (blue), and experimental values for AQDS and
DHAQDS (red squares). d, Cyclic voltammogram of AQDS and DHAQDS
(1 mM) in 1 M H2SO4 on a glassy carbon electrode (scan rate = 25 mV s⁻¹).


These features lower the capital cost of storage chemicals per kilowatt
hour, which sets a floor on the ultimate system cost per kilowatt hour
at any scale. The precursor molecule anthracene is abundant in crude
petroleum and is already oxidized on large scale to anthraquinone.
Sulphonated anthraquinones are used on an industrial scale in wood pulp
processing for paper, and they can be readily synthesized from the
commodity chemicals anthraquinone and oleum. In fact, a cyclic voltammogram
of the crude sulphonation product of these two reagents is
virtually identical to that of pure AQDS (Extended Data Fig. 8). Based
on this simple electrolyte preparation that requires no further product
separation, we estimate chemical costs of $21 per kilowatt hour for AQDS
and $6 per kilowatt hour for bromine (see Methods for information
on cost calculations). The QBFB offers major cost improvements over
vanadium flow batteries with redox-active materials that cost $81 per
kilowatt hour (ref. 18). Optimization of engineering and operating parameters
such as the flow field geometry, electrode design, membrane separator
and temperature, which has not yet even begun, should lead
to significant performance improvements in the future, as it has for
vanadium flow batteries, which took many years to reach the power
densities we report here. The use of redox processes in π-aromatic
organic molecules represents a new and promising direction for cost-effective,
large-scale energy storage.


METHODS SUMMARY 


The QBFB comprised a mixture of commercially available and custom-made
components. Pretreated 2 cm², stacked (×6) Toray carbon paper electrodes (each
of which is about 7.5 µm uncompressed) were used on both sides of the cell. Nafion
212 (50 µm thick) was used as a proton-exchange membrane, and PTFE gasketing
was used to seal the cell assembly. On the positive side of the cell, 120 ml of 3 M
HBr and 0.5 M Br2 were used as the electrolyte solution in the fully discharged state;
on the negative side, 1 M 2,7-AQDS in 1 M H2SO4 was used. AQDS disodium salt
was flushed twice through a column containing Amberlyst 15H ion-exchange resin
to remove the sodium ions. Half-cell measurements were conducted using a Ag/
AgCl aqueous reference electrode (3 M KCl filling solution), a Pt wire counter
electrode and a 3-mm-diameter glassy carbon disk electrode. For theoretical
calculations, the total free energies of molecules were obtained from first-principles
quantum chemical calculations within density functional theory at the level of
the generalized gradient approximation (GGA) using the PBE functional.
Three-dimensional conformer structures for each quinone/hydroquinone molecule were
generated using the ChemAxon suite with up to 25 generated conformers per molecule
using the Dreiding force field. Generated conformers were used as input structures
for the density functional theory geometry optimization employed for determining
the formation energy, which in turn is used to evaluate the reduction potential. In
the QBFB cost calculation, a price of $4.74 per kilogram (eBioChem) was used for
anthraquinone. To get the sulphonated form actually used here, anthraquinone
must be reacted with oleum (H2SO4/SO3), which adds a negligible cost at scale; this
cost is not included here. The price of bromine was $1.76 per kilogram, based on
estimates from the US Geological Survey. The cell voltage used to calculate costs
here was 0.858 V.
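
The chemical cost figures quoted in the main text ($21 per kilowatt hour for AQDS and $6 per kilowatt hour for bromine) can be reproduced from the prices and cell voltage given above. The sketch below is one plausible reconstruction rather than the authors' own calculation: it assumes two electrons stored per anthraquinone molecule and per Br2 molecule, costs the quinone on the basis of unsulphonated anthraquinone (as the text indicates), and uses standard molar masses (208.22 g mol⁻¹ for anthraquinone, 159.81 g mol⁻¹ for Br2) that are not stated in the text.

```python
F = 96485.0            # Faraday constant, C mol^-1
J_PER_KWH = 3.6e6      # joules in one kilowatt hour


def cost_per_kwh(price_usd_per_kg: float, molar_mass_g_mol: float,
                 electrons_per_molecule: int, cell_voltage_v: float) -> float:
    """Chemical cost (USD per kWh) of one redox-active material."""
    energy_kwh_per_mol = electrons_per_molecule * F * cell_voltage_v / J_PER_KWH
    kg_per_kwh = (molar_mass_g_mol / 1000.0) / energy_kwh_per_mol
    return price_usd_per_kg * kg_per_kwh


CELL_VOLTAGE_V = 0.858   # value quoted for the cost calculation

# Molar masses are standard values, assumed here (not given in the text).
anthraquinone = cost_per_kwh(4.74, 208.22, 2, CELL_VOLTAGE_V)
bromine = cost_per_kwh(1.76, 159.81, 2, CELL_VOLTAGE_V)

print(f"anthraquinone: ${anthraquinone:.2f} per kWh")
print(f"bromine:       ${bromine:.2f} per kWh")
print(f"total:         ${anthraquinone + bromine:.2f} per kWh")
```

Under these assumptions the two results round to the quoted $21 and $6 per kilowatt hour, which suggests that the published estimate uses the same basis.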


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS


Full cell measurements. The QBFB comprised a mixture of commercially available
and custom-made components. Circular endplates were machined out of solid
aluminium. Current collectors were 3 inch × 3 inch pyrolytic graphite blocks with
interdigitated flow channels (channel width = 0.0625 inch, channel depth = 0.08 inch,
landing between channels = 0.031 inch, Fuel Cell Technologies). Pretreated 2 cm²,
stacked (×6) Toray carbon paper electrodes (each of which is about 7.5 µm
uncompressed) were used on both sides of the cell. Pretreatment consisted of a 10 min
sonication in isopropyl alcohol followed by a five hour soak in a hot (50 °C) mixture
of undiluted sulphuric and nitric acids in a 3:1 volumetric ratio. Nafion 212 (50 µm
thick) was used as a proton-exchange membrane (PEM, Alfa Aesar), and PTFE
gasketing was used to seal the cell assembly. Membrane pretreatment was done
according to previously published protocols. Six bolts (3/8 inch, 16 threads per inch)
torqued to 10.2 N m completed the cell assembly, and PTFE tubing was used to
transport reactants and products into and out of the cell. The cell was kept on a hot
plate and wrapped in a proportional integral derivative (PID)-controlled heating
element for temperature control. On the positive side of the cell, 120 ml of 3 M HBr
and 0.5 M Br2 were used as the electrolyte solution in the fully discharged state; on
the negative side, 1 M AQDS in 1 M H2SO4 was used. HBr was used on the negative
side instead of H2SO4 for stability testing results displayed in Extended Data Fig. 9.
State-of-charge calculations are based on the composition of the quinone side of
the cell. 2,7-Anthraquinone disulphonate disodium salt 98% (TCI) was flushed
twice through a column containing Amberlyst 15H ion-exchange resin to remove
the sodium ions. Measurements shown here were done at 40 °C. March centrifugal
pumps were used to circulate the fluids at a rate of approximately 200 ml min⁻¹.
For characterization, several instruments were used: a CH Instruments 1100C
potentiostat (which can be used up to ±2 A), a DC electronic load (Circuit Specialists)
for galvanic discharge, a DC regulated power supply (Circuit Specialists) for
electrolytic characterization, and a standard multimeter for independent voltage
measurements. The cell was charged at 1.5 V until a fixed amount of charge ran through
the cell. During this process, the electrolyte colours changed from orange to dark
green (AQDS to AQDSH2) and from colourless to red (Br⁻ to Br2). Periodically,
the open circuit potential was measured, providing the data inset in Fig. 1b. Also, at
various SOCs, potential-current behaviour was characterized using the equipment
described above: a fixed current was drawn from the cell, and the voltage, once
stabilized, was recorded (Fig. 1b). For the cycling data in Fig. 2b, the potentiostat
was used for constant-current (±0.5 A cm⁻²) measurements with cut-off voltages
of 0 V and 1.5 V. For the cycling data in Fig. 2a, a more dilute quinone solution
(0.1 M as opposed to 1 M) was used. Here, the half-cycle lengths were programmed
to run at constant current for a fixed amount of time, provided the voltage cut-offs
were not reached, so that half of the capacity of the battery was used in each cycle.
The voltage cut-offs were never reached during charging, but were reached during
discharge. Current efficiencies are evaluated by dividing the discharge time by the
charge time of the previous half-cycle.

As shown in Fig. 2, current efficiency starts at about 92% and climbs to about 
95% over ~15 standard cycles. Note that these measurements are done near viable 
operating current densities for a battery of this kind. Because of this, we believe this 
number places an upper bound on the irreversible losses in the cell. In any case, 
95% is comparable to values seen for other battery systems. For example, ref. 19 
reports vanadium bromide batteries with current efficiencies of 50-90%, with large 
changes in current efficiency observed for varying membrane conditions. Our system 
will probably be less dependent on membrane conditions because we are storing 
energy in anions and neutral species as opposed to cations, which Nafion can 
conduct reasonably well. 

In Fig. 2b we illustrate the capacity retention of the battery (that is, the number 
of coulombs available for discharge at the nth cycle divided by that available for 
discharge at the (n — 1)th cycle) to be 99.2% on average, which is quite high and 
provides direct evidence that our irreversible losses are below 1%. If we attribute all 
of this loss (the 0.78% capacity fade per cycle) to some loss of redox-active quinone, 
it would be equivalent to losing 0.0006634 moles of quinone per cycle. If we attribute 
all of the loss to bromine crossover (which would react with the hydroquinone and 
oxidize it back to quinone), this corresponds to a crossover current density of 
1.785 mA cm⁻², which is within the range of the widely varying crossover values
reported in the literature. Note that these crossover numbers can be very sensitive
to membrane pretreatment conditions. It is also important to mention that, to 
determine very accurate current efficiencies, detailed chemical analyses of the 
electrolyte are necessary. 
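
The two loss scenarios in this paragraph are linked by Faraday's law, and the link can be sketched numerically. The code below converts the quoted quinone loss per cycle (0.0006634 mol) into an equivalent charge, assuming two electrons per quinone, and then asks what cycle duration would make that charge equal to the quoted crossover current density acting over the 2 cm² electrode area. The cycle duration and the total quinone inventory that come out are inferences from the quoted numbers, not values reported in the text, and the function names are hypothetical.

```python
F = 96485.0  # Faraday constant, C mol^-1


def charge_from_moles_c(moles: float, electrons_per_molecule: int = 2) -> float:
    """Charge (C) equivalent to a given amount of redox-active molecule."""
    return moles * electrons_per_molecule * F


def implied_cycle_time_s(charge_c: float, current_density_a_cm2: float,
                         area_cm2: float) -> float:
    """Duration (s) over which a steady crossover current transfers charge_c."""
    return charge_c / (current_density_a_cm2 * area_cm2)


LOSS_MOL_PER_CYCLE = 0.0006634     # quoted in the text
FADE_PER_CYCLE = 0.0078            # the 0.78% capacity fade quoted in the text
CROSSOVER_A_CM2 = 1.785e-3         # quoted in the text
AREA_CM2 = 2.0                     # geometric electrode area from the Methods

lost_charge = charge_from_moles_c(LOSS_MOL_PER_CYCLE)
cycle_time = implied_cycle_time_s(lost_charge, CROSSOVER_A_CM2, AREA_CM2)
inventory_mol = LOSS_MOL_PER_CYCLE / FADE_PER_CYCLE

print(f"charge lost per cycle: {lost_charge:.1f} C")
print(f"implied cycle duration: {cycle_time / 3600.0:.1f} h")
print(f"implied quinone inventory: {inventory_mol:.3f} mol")
```

The implied cycle duration of roughly ten hours is of the same order as the time needed to charge and discharge the implied inventory at the 1 A total current of Fig. 2b, so the two quoted numbers appear mutually consistent under these assumptions.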

Half-cell measurements. These were conducted using a BASi Epsilon EC potentiostat,
a BASi Ag/AgCl aqueous reference electrode (RE-5B, 3 M KCl filling solution)
and a Pt wire counter electrode. Rotating disk electrode (RDE) measurements
were conducted using a BASi RDE (RDE-2) and a 3 mm diameter glassy carbon
disk electrode. Electrode potentials were converted to the standard hydrogen
electrode (SHE) scale using E(SHE) = E(Ag/AgCl) + 0.210 V, where E(SHE) is the
potential versus SHE and E(Ag/AgCl) is the measured potential versus Ag/AgCl.
2,7-Anthraquinone disulphonate disodium salt 98% was purchased from TCI and
used as received. 1,8-Dihydroxy-anthraquinone-2,7-disulphonic acid was made
according to the literature procedure. The electrolyte solution was sulphuric acid
(ACS, Sigma) in deionized H₂O (18.2 MΩ cm, Millipore). The Pourbaix diagram
(plot of E° versus pH) shown in Extended Data Fig. 4 was constructed using aqueous
1 mM solutions of AQDS in pH buffers using the following chemicals: sulphuric
acid (1 M, pH 0), HSO₄⁻/SO₄²⁻ (0.1 M, pH 1-2), AcOH/AcO⁻ (0.1 M, pH 2.65-5),
H₂PO₄⁻/HPO₄²⁻ (0.1 M, pH 5.3-8), HPO₄²⁻/PO₄³⁻ (0.1 M, pH 9.28-11.52), and
KOH (0.1 M, pH 13). The pH of each solution was adjusted with 1 M H₂SO₄ or
0.1 M KOH solutions and measured with an Oakton pH 11 Series pH meter
(Eutech Instruments).
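
The reference-electrode conversion and the Pourbaix slopes quoted for Extended Data
Fig. 4 can be checked with a short calculation. The sketch below is illustrative only:
the function names and the assumed temperature of 298.15 K are ours and are not stated
in the text; the 0.210 V offset is the value given above. For a reduction consuming m
protons and n electrons, the Nernst equation gives a slope of −(m/n) × ln(10) × RT/F
per pH unit.

```python
import math

R = 8.314      # universal gas constant, J mol^-1 K^-1
F = 96485.0    # Faraday constant, C mol^-1


def to_she(e_vs_agcl_volts):
    """Convert a potential measured versus Ag/AgCl (3 M KCl) to the SHE scale."""
    return e_vs_agcl_volts + 0.210  # offset taken from the text


def pourbaix_slope_mv_per_ph(n_protons, n_electrons, temperature=298.15):
    """Nernstian slope of E versus pH, in mV per pH unit.

    The temperature default (298.15 K) is an assumption for illustration.
    """
    if n_protons == 0:
        return 0.0  # a proton-free couple has no pH dependence
    nernst_volts = math.log(10) * R * temperature / F
    return -1000.0 * (n_protons / n_electrons) * nernst_volts


# Two-electron reduction with two, one and zero protons
slopes = [pourbaix_slope_mv_per_ph(m, 2) for m in (2, 1, 0)]
```

Rounded to the nearest millivolt, the three slopes match the −59, −30 and 0 mV per pH
unit lines used to fit the diagram.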

RDE studies. All RDE data represent an average of three runs. Error bars in Extended
Data Figs 2 and 3 indicate standard deviations. Before each run, the glassy carbon
disk working electrode was polished to a mirror shine with 0.05 µm alumina and
rinsed with deionized water until a cyclic voltammogram of a solution of 1 mM
AQDS in 1 M H₂SO₄ showed anodic and cathodic peak voltage separation of 34 to
35 mV (39 mV for DHAQDS) at a sweep rate of 25 mV s⁻¹. The electrode was then
rotated at 200, 300, 400, 500, 700, 900, 1,200, 1,600, 2,000, 2,500 and 3,600 r.p.m.
while the voltage was linearly swept from 310 to 60 mV (250 to −100 mV for DHAQDS)
at 10 mV s⁻¹ (Extended Data Fig. 1). The currents measured at 60 mV (−100 mV for
DHAQDS) (that is, the diffusion-limited current density) versus the square root of
the rotation rate (ω) are plotted in Extended Data Fig. 2. The data were fitted with a
straight line, with the slope defined by the Levich equation as
0.620nFAC_O D^(2/3) ν^(−1/6), where n = 2, Faraday's constant F = 96,485 C mol⁻¹,
electrode area A = 0.0707 cm², AQDS concentration C_O = 10⁻⁶ mol cm⁻³ and
kinematic viscosity ν = 0.01 cm² s⁻¹.
This gives D values of 3.8(1) × 10⁻⁶ cm² s⁻¹ for AQDS and 3.19(7) × 10⁻⁶ cm² s⁻¹
for DHAQDS. The reciprocal of the current at overpotentials of 13, 18, 23, 28, 33,
38 and 363 mV was plotted versus the reciprocal of the square root of the rotation
rate (Fig. 3b and Extended Data Fig. 2). The data for each potential were fitted with
a straight line; the intercept gives the reciprocal of i_K, the current in the absence of
mass transport limitations (the extrapolation to infinite rotation rate). A plot of
log₁₀(i_K) versus overpotential was linearly fitted with a slope of 62 mV (AQDS) and
68 mV (DHAQDS) defined by the Butler-Volmer equation as 2.3RT/(αnF) (Extended
Data Fig. 3), where R is the universal gas constant, T is temperature in kelvin and α
is the charge transfer coefficient. This gives α = 0.474(2) for AQDS and 0.43(1) for
DHAQDS. The x-intercept gives the log of the exchange current i₀, which is equal
to FAC_O k₀, and gives k₀ = 7.2(5) × 10⁻³ cm s⁻¹ for AQDS and 1.56(5) × 10⁻² cm s⁻¹
for DHAQDS.
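
As a rough check on the arithmetic above, the Levich and Butler-Volmer relations can be
inverted numerically. This is a minimal sketch, not the authors' analysis code. The
function names are ours, the temperature of 298.15 K is an assumption, and the
intercepts are taken to be log₁₀ of the current in amperes. The fitted slopes and
intercepts are the rounded values quoted in the captions of Extended Data Figs 2 and 3,
so the results only approximately reproduce the reported D, α and k₀.

```python
F = 96485.0      # Faraday constant, C mol^-1 (from the text)
R = 8.314        # universal gas constant, J mol^-1 K^-1
N_ELECTRONS = 2  # n (from the text)
AREA = 0.0707    # electrode area A, cm^2 (from the text)
CONC = 1e-6      # C_O, mol cm^-3, i.e. 1 mM (from the text)
NU = 0.01        # kinematic viscosity, cm^2 s^-1 (from the text)


def diffusion_coefficient(levich_slope):
    """Solve slope = 0.620 n F A C_O D^(2/3) nu^(-1/6) for D in cm^2 s^-1.

    levich_slope is in A s^(1/2) rad^(-1/2).
    """
    prefactor = 0.620 * N_ELECTRONS * F * AREA * CONC * NU ** (-1.0 / 6.0)
    return (levich_slope / prefactor) ** 1.5


def transfer_coefficient(tafel_slope_volts, temperature=298.15):
    """Solve slope = 2.3 R T / (alpha n F) for alpha (temperature is assumed)."""
    return 2.3 * R * temperature / (N_ELECTRONS * F * tafel_slope_volts)


def rate_constant(log10_i0):
    """Solve i0 = F A C_O k0 for k0 in cm s^-1, given log10(i0 in amperes)."""
    return 10.0 ** log10_i0 / (F * AREA * CONC)


# Slopes and intercepts quoted in Extended Data Figs 2 and 3
d_aqds = diffusion_coefficient(4.53e-6)
d_dhaqds = diffusion_coefficient(3.94e-6)
alpha_aqds = transfer_coefficient(0.062)
alpha_dhaqds = transfer_coefficient(0.068)
k0_aqds = rate_constant(-4.32)
k0_dhaqds = rate_constant(-3.95)
```

Small differences from the reported values are expected because the inputs here are
rounded and the measurement temperature is assumed.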

Stability studies. AQDS (50 mg) was dissolved in 0.4 ml of D₂O and treated with
100 µl of Br₂. The ¹H and ¹³C NMR spectra (Extended Data Figs 5a, b and 6a, b)
were unchanged from the starting material after standing for 20 h at 25 °C. AQDS
(50 mg) was then treated with 1 ml of 2 M HBr and 100 µl of Br₂. The reaction was
heated to 100 °C for 48 h and evaporated to dryness at that temperature. The
resulting solid was fully dissolved in D₂O giving unchanged ¹H and ¹³C NMR
(Extended Data Figs 5c and 6c); however, the ¹H NMR reference was shifted due to
residual acid. These results imply that bromine crossover will not lead to
irreversible destruction of AQDS.

Sulphonation of anthraquinone and electrochemical study. 9,10-Anthraquinone
was treated with H₂SO₄ (20% SO₃) at 170 °C for 2 h according to a literature
procedure. The resulting red solution, containing roughly 37% AQDS, 60%
9,10-anthraquinone-2,6-disulphonic acid and 3% 9,10-anthraquinone-2-sulphonic
acid, was diluted and filtered. A portion of this solution was further diluted with
1 M H₂SO₄ to ~1 mM total anthraquinone concentration. The cyclic voltammogram
(Extended Data Fig. 8) is similar to that of pure AQDS, though the anodic/cathodic
peak current density is broadened due to the presence of the multiple
sulphonic acid isomers.

Theory and methods. We used a fast and robust theoretical approach to determine
the E° of quinone/hydroquinone couples in aqueous solutions. We employed
an empirical linear correlation of ΔH_f, the heat of formation of hydroquinone at
0 K from the quinone and hydrogen gas, to the measured E° values. Following the
treatment of ref. 22, the linear correlation is described as ΔG = −nFE°, where ΔG
is the difference in total free energy between quinone and hydroquinone, n is the
number of electrons involved in the reaction and F is the Faraday constant. The
entropy contributions to the total free energies of reaction have been neglected
because the entropies of reduction of quinones are found to be very similar, and
the E° of the oxidation-reduction system is linearly expressed as (−nF)⁻¹ΔH_f + b,
where b is a constant. It was also assumed that the reduction of quinones takes
place in a single-step reaction involving a two-electron two-proton process. The
total free energies of molecules were obtained from first-principles quantum
chemical calculations within density functional theory (DFT) at the level of the
generalized gradient approximation (GGA) using the PBE functional. The projector
augmented wave (PAW) technique and a plane-wave basis set as implemented
in VASP were employed. The kinetic energy cut-off for the plane-wave basis
was set at 500 eV, which was sufficient to converge the total energies on a scale of
1 meV per atom. To obtain the ground-state structures of molecules in the gas
phase, we considered multiple initial configurations for each molecule and optimized
them in a cubic box of 25 Å using Γ-point sampling. The geometries were
optimized without any symmetry constraints using the conjugate gradient (CG)
algorithm, and the convergence was assumed to be complete when the total
remaining forces on the atoms were less than 0.01 eV Å⁻¹.

The search for conformational preference through theoretical calculations for each
hydroxylated quinone is crucial because of the significant effects of intramolecular
hydrogen bonds on the total free energies of the molecules. Three-dimensional
conformer structures for each quinone/hydroquinone molecule were generated using
the ChemAxon suite (Marvin 6.1.0 by ChemAxon, http://www.chemaxon.com)
with up to 25 conformers generated per molecule using the Dreiding force field.
The conformers generated were used as input structures for the DFT geometry
optimization employed for determining ΔH_f, which in turn is used to estimate E°
and G°_solv.

To calculate the E° of the hydroxy-substituted AQDS molecules (Fig. 3c), the
correlation between ΔH_f and E° was calibrated from experimental data on six
well-characterized quinones. Specifically, we used the experimental values of the aqueous
E° and computed ΔH_f of 1,2-benzoquinone, 1,4-benzoquinone, 1,2-naphthoquinone,
1,4-naphthoquinone, 9,10-anthraquinone and 9,10-phenanthrene. The training
set ensures that the calibration plot addresses most classes and aspects of quinones,
including two quinones each from 1-ring (benzoquinone), 2-ring (naphthoquinone)
and 3-ring (anthraquinone and phenanthrene) structures. In addition, the experimental
values of E° of the training set spanned 0.09 V (9,10-anthraquinone) to
0.83 V (1,2-benzoquinone), providing a wide range for E° (Extended Data Fig. 7).
The linear calibration plot for E° yields an R² = 0.97 between the calculated ΔH_f
and E° (Extended Data Fig. 7).

The G°_solv values of the quinones in water were calculated using the Jaguar 8.0
program in the Schrödinger suite 2012 (Jaguar, version 8.0, Schrödinger). The
standard Poisson-Boltzmann solver was employed. In this model, a layer of
charges on the molecular surface represents the solvent. G°_solv was calculated as the
difference between the total energy of the solvated structure and the total energy of
the molecule in vacuum. A more negative value for G°_solv corresponds to a quinone
that is likely to have a higher aqueous solubility. An absolute prediction of the
solubility is not readily available, as the accurate prediction of the most stable forms
of molecular crystal structures with DFT remains an open problem.

Cost calculations. These were done using the following formula: C = (3.6 × 10³) ×
(PM)/(nFE), where C is the cost in US dollars of the compound per kilowatt hour, P
is the cost in US dollars per kilogram, M is the molecular mass of the compound
(in g mol⁻¹), F is Faraday's constant (96,485 C mol⁻¹), n is the number of moles of
electrons transferred per mole of storage compound (two for the QBFB) and E is the
open-circuit voltage (V) of the storage device. In calculating the price for the
anthraquinone-bromine battery, a price of $4.74 per kilogram (eBioChem) was used
for anthraquinone (note that, to get the sulphonated form actually used here,
anthraquinone must be reacted with oleum (H₂SO₄/SO₃), which adds a negligible cost
at scale; this cost is not included here). The price of bromine was $1.76 per
kilogram, based on estimates from the US Geological Survey. The cell voltage used
to calculate costs here was 0.858 V. For vanadium, costs were calculated from 2011
USGS prices of vanadium pentoxide at $14.37 per kilogram, and the cell voltage used
was 1.2 V. Balance-of-system costs have not been estimated because the technology
is too immature.
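
The cost formula can be evaluated directly. The sketch below is illustrative and rests
on assumptions that are not in the text: the molar masses (208.21 g mol⁻¹ for
anthraquinone, 159.81 g mol⁻¹ for Br₂), the choice to treat bromine as Br₂ with n = 2,
and the function name. The prefactor is 3.6 × 10⁶ J per kWh divided by 1,000 g per kg,
so that M can be entered in g mol⁻¹ while P stays in dollars per kilogram.

```python
F = 96485.0  # Faraday constant, C mol^-1 (from the text)


def cost_per_kwh(price_per_kg, molar_mass_g_per_mol, n_electrons, cell_voltage):
    """Chemical cost in US dollars per kWh: C = 3.6e3 * P * M / (n * F * E)."""
    return 3.6e3 * price_per_kg * molar_mass_g_per_mol / (
        n_electrons * F * cell_voltage
    )


# Molar masses are assumptions for illustration, not values given in the text
M_ANTHRAQUINONE = 208.21  # g mol^-1, C14H8O2
M_BROMINE = 159.81        # g mol^-1, Br2 (two electrons per Br2 assumed)

anthraquinone_cost = cost_per_kwh(4.74, M_ANTHRAQUINONE, 2, 0.858)
bromine_cost = cost_per_kwh(1.76, M_BROMINE, 2, 0.858)
total_cost = anthraquinone_cost + bromine_cost
```

Under these assumptions the anthraquinone contribution comes out at a little over $21
per kWh and the bromine contribution at about $6 per kWh.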

Extended Data Figure 1 | Plot of potential versus current density at
different rotation rates of the RDE. The solution is 1 mM DHAQDS in
1 M H₂SO₄, using a rotating disk electrode (RDE) of glassy carbon. Rotation
rates are 200, 300, 400, 500, 700, 900, 1,200, 1,600, 2,000, 2,500 and 3,600 r.p.m.

Extended Data Figure 2 | Levich and Koutecký-Levich plots obtained
using the RDE. a, Levich plot (limiting current versus square root of
rotation rate ω) of 1 mM AQDS in 1 M H₂SO₄ (the fitted line has a slope of
4.53(2) µA s^(1/2) rad^(−1/2), giving D = 3.8(1) × 10⁻⁶ cm² s⁻¹). Data are an average
of three runs; error bars indicate the standard deviation. b, As a but for
DHAQDS in 1 M H₂SO₄ (slope of 3.94(6) µA s^(1/2) rad^(−1/2)
gives D = 3.19(7) × 10⁻⁶ cm² s⁻¹). c, Koutecký-Levich plot (i⁻¹ versus ω^(−1/2))
of 1 mM DHAQDS in 1 M H₂SO₄. The current response, i, is shown for
seven different AQDS reduction overpotentials η.

Extended Data Figure 3 | Fit of Butler-Volmer equation. Constructed
using the current response in the absence of mass transport at low AQDS
reduction overpotentials; i_K is the current extrapolated from the zero-intercept
of Fig. 3b and Extended Data Fig. 2c (infinite rotation rate). Data are an average
of three runs; error bars indicate the standard deviation. a, AQDS: best-fit
line has the equation y = 62(x + 4.32). This yields α = 0.474(2) and
k₀ = 7.2(5) × 10⁻³ cm s⁻¹. b, DHAQDS: best-fit line is the function
y = 68(x + 3.95). This yields α = 0.43(1) and k₀ = 1.56(5) × 10⁻² cm s⁻¹.

Extended Data Figure 4 | Pourbaix diagram (E° vs pH) of AQDS. Data are
fitted to three solid lines indicating slopes of −59 mV pH⁻¹, −30 mV pH⁻¹
and 0 mV pH⁻¹, corresponding to two-, one- and zero-proton processes,
respectively. Dashed lines linearly extrapolate the one- and zero-proton
processes to give E° values of 18 mV (2e⁻/1H⁺) and −296 mV (2e⁻/0H⁺).

Extended Data Figure 5 | ¹H NMR (500 MHz, D₂O) spectra. a, Spectrum of
AQDS: chemical shift δ = 7.99 p.p.m. versus tetramethylsilane (TMS) (doublet
(d), coupling constant J = 2 Hz, 1,8 C-H), 7.79 p.p.m. (doublet of doublets,
J = 2 and 8 Hz, 4,5 C-H), 7.50 p.p.m. (d, J = 8 Hz, 3,6 C-H). b, The same
sample, 20 h after addition of Br₂. c, ¹H NMR of AQDS treated with 2 M HBr
and Br₂ and heated to 100 °C for 48 h. The peaks are shifted due to presence of
trace HBr, which shifted the residual solvent peak due to increased acidity.
Coupling constants for each peak are identical to a.

Extended Data Figure 6 | ¹³C NMR (500 MHz, D₂O) spectra. a, AQDS,
δ = 181.50 p.p.m. versus TMS (C 9), 181.30 p.p.m. (C 10), 148.51 p.p.m.
(C 2,7), 133.16 p.p.m. (C 11), 132.40 p.p.m. (C 12), 130.86 p.p.m. (C 3,6),
128.59 p.p.m. (C 4,5), 124.72 p.p.m. (C 1,8). b, The same sample, 24 h after
addition of Br₂. c, ¹³C NMR of AQDS treated with 2 M HBr and Br₂ and heated
to 100 °C for 48 h.

Extended Data Figure 7 | Calibration model for ΔH_f and experimental E°.
This shows a linear relationship (red dashed line; R² = 0.97) between calculated
ΔH_f (this work) and experimental E° (from the literature) of six quinones
in aqueous solutions: BQ, benzoquinone; NQ, naphthoquinone; AQ,
anthraquinone; and PQ, phenanthraquinone.

Extended Data Figure 8 | AQDS cyclic voltammograms. Black curve,
obtained for a 1 mM solution of AQDS in 1 M H₂SO₄ on a stationary glassy
carbon working electrode. Red curve, obtained for a crude anthraquinone
sulphonation solution containing a mixture of AQDS,
9,10-anthraquinone-2,6-disulphonic acid and 9,10-anthraquinone-2-sulphonic acid
diluted to 1 mM total anthraquinone in 1 M H₂SO₄.

Extended Data Figure 9 | Flow-battery cycling behaviour with HBr
electrolyte on both sides. Data collected by cycling the current at 0.2 A cm⁻² at
40 °C using a 2 M HBr + 0.5 M Br₂ solution on the positive side and a 2 M
HBr + 0.1 M AQDS solution on the negative side; cell potential versus time
performance is comparable to data in Fig. 2.

Extended Data Table 1 | AQDS screened by theoretical calculations


Merging allylic carbon-hydrogen and selective
carbon-carbon bond activation


Ahmad Masarwa, Dorian Didier, Tamar Zabrodski, Marvin Schinkel, Lutz Ackermann & Ilan Marek


Since the nineteenth century, many synthetic organic chemists have
focused on developing new strategies to regio-, diastereo- and
enantioselectively build carbon-carbon and carbon-heteroatom bonds
in a predictable and efficient manner. Ideal syntheses should use
the fewest synthetic steps, with few or no functional group
transformations and by-products, and maximum atom efficiency.
One potentially attractive method for the synthesis of molecular
skeletons that are difficult to prepare would be through the selective
activation of C-H and C-C bonds, instead of the conventional
construction of new C-C bonds. Here we present an approach that
exploits the multifold reactivity of easily accessible substrates with
a single organometallic species to furnish complex molecular scaffolds
through the merging of otherwise difficult transformations:
allylic C-H and selective C-C bond activations. The resulting
bifunctional nucleophilic species, all of which have an all-carbon
quaternary stereogenic centre, can then be selectively derivatized by
the addition of two different electrophiles to obtain more complex
molecular architecture from these easily available starting materials.
The direct transformation of C-H bonds into C-C and C-heteroatom
bonds by means of C-H activation has had an enormous impact on
advancing the field of chemical synthesis, because it led to the design of
novel, selective, economic and efficient processes for the utilization of
hydrocarbons. Carbon-carbon bond activation by soluble metal
complexes, an alternative strategy, is based on either oxidative addition
or β-carbon and radical cleavage. The former approach, requiring the
introduction of pincer-type ligands, has expanded the field by bringing
the metal centre close to the otherwise 'hidden' C-C bond through
intramolecular chelation. However, this system puts some limitations
on synthesis and, except for hydrogenation and the silylation of
unactivated C-C bonds, no other applications were described. The latter
approach relies on strained substrates that do not require such
pre-coordination, but the selectivity of the C-C bond activation remains a
major problem when C-C bonds are differentially substituted.
Selective C-H bond activation and selective C-C bond activation
are cutting-edge methods that, in combination, may be ideal for
accessing complex molecular architectures. Indeed, the ability to design the
preparation of complex molecular skeletons through a succession of
allylic C-H and selective C-C bond activations would pave the way for
new directions in organic synthesis. Unifying these two independent
strategies into a single approach would necessitate perfect control
of both modes of activation. Here we describe the development and
study of reactions in which the challenges of selective C-H bond
activation and selective C-C bond activation are overcome, demonstrating
the efficacy of a synthetic approach that could be converted to the
preparation of all-carbon quaternary stereogenic centres in acyclic
systems. We predicted that a direct and expedient approach to such
complex molecular frameworks containing several stereogenic centres
in acyclic systems could result from either ω-ene-cyclopropanes 1 or
alkylidenecyclopropanes 2. Ideally, it could proceed through a series of
allylic C-H activations ('metal-walk' reactions) followed by a selective
C-C bond activation of the three-membered ring to give intermediate
3. Then selective reactions with two different electrophiles would
functionalize 3 into the linear adduct 4, which possesses new stereogenic
centres including the original all-carbon quaternary stereocentre (path A;
Fig. 1a). Alternatively, ω-ene-cyclopropanes possessing a leaving group 5,
treated under the same conditions, would undergo the allylic C-H
activation followed by a selective C-C fragmentation to lead to
non-conjugated dienes 6 (path B; Fig. 1b).

This approach would require the use of a single organometallic species
that could mediate a combination of allylic C-H activation, selective
C-C bond activation and selective functionalization with two different
electrophiles. Our solution to this salient problem is based on the Zr-walk
chemistry we developed for the transformation of simple ω-unsaturated
fatty alcohols and non-conjugated enol ether derivatives into
allyl- and dienylzirconocene derivatives, respectively. Recently, a
related enantioselective Heck-type arylation of alkenyl alcohols using
a Pd walk as a redox-relay strategy was reported.

Figure 1 | General approach combining allylic C-H and selective C-C
bond activation towards functionalized adducts. a, On path A,
ω-ene-cyclopropanes 1 undergo a metal-mediated allylic C-H activation followed by a
selective C-C bond activation to lead to bismetalated species 3. Because the
reactivity of an allylmetal species to electrophiles is higher than that of
the alkylmetal species, the first electrophile (E¹-X) reacts selectively to
functionalize the γ-carbon whereas the second electrophile (E²-X) reacts with
the remaining sp³ organometallic species to give the functionalized acyclic
adduct 4. When alkylidenecyclopropane species 2 are used, the chemistry
proceeds similarly and 4 is formed. b, Alternatively (on path B), when
ω-ene-cyclopropane 5 is used, the sequence goes through the combined allylic
C-H activation/selective C-C bond cleavage promoted by an elimination
reaction to give the non-conjugated dienes 6 possessing the all-carbon
quaternary stereocentre. Me, methyl.

Our mechanistic rationale was to transform the model ω-alkenyl
cyclopropane species 1a by means of a zirconocene-mediated migration
of the double bond with the Negishi reagent. Indeed, C₄H₈ZrCp₂
would react with the remote double bond of 1a to give the corresponding
zirconacyclopropane, 7. Then, through an allylic C-H bond activation,
the η³-allyl intermediate 8 would be generated as a transient
species and, after hydride insertion, would give rise to the new
zirconacyclopropane, 9, possessing a C-Zr bond in the β-position of the
three-membered ring (Fig. 2). When 9 is formed, an irreversible C-C
bond activation proceeding through ring cleavage should take place. If
the C-C bond activation occurs through the C₂-C₃ bond, a primary
acyclic organometallic species, 3a, could be obtained. Alternatively,
ring opening through the activation of the C₁-C₂ bond (not shown
in Fig. 2) would result in an unfavourable trisubstituted organometallic
species. On the assumption that primary organometallic species
are kinetically and thermodynamically more stable than tertiary
organometallic species, we anticipate an exclusive formation of the
constitutional isomer 3a in the activation reaction. Hence, the presence
of substituents on the cyclopropyl unit furnishes a highly selective C-C
bond activation.

Importantly, even if our hypotheses regarding selective bond activation
were correct, the synthetic problem would not yet be fully solved,
because it is necessary to differentiate the reactivity of the two C-metal
bonds in 3a (or 3b) with two different electrophiles. It occurred to us
that in 3a (or 3b), the allylzirconocene moiety [C₁-Zr] might be
significantly more reactive than the alkylzirconocene moiety [C₃-Zr] (ref. 22).
Therefore, the first electrophile should react selectively
to give 10 via an S_E2′ reaction, and 10 will then go on to react with a second
electrophile to form 4 (Fig. 2). On addition of the Negishi reagent to 1a
(n = 1) and subsequent hydrolysis, we isolated 4a as the E-isomer in
excellent yield (Fig. 2). When D₂O was added at the
end of the sequence, the bisdeuterated compound 4a(D) was similarly
afforded with an exclusive introduction of deuterium both at the allylic
and primary alkyl positions. This tandem allylic C-H activation/selective
C-C activation was not limited to ω-ene-cyclopropanes with a one-carbon
tether. Compounds 1b-1d (with respective carbon tethers n = 2, 3 and
6) also underwent this transformation via an initial migration of the
zirconocene through the alkyl chain with formation of 4b-4d in good
yields. This reaction sequence always led to a selective C-C activation
of the 1,1-disubstituted ω-ene-cyclopropanes 4a-4e (R¹, R² = alkyl or
aryl) because it led to the preferential formation of a primary
alkyl-zirconacyclobutane over a tertiary alkyl-zirconacyclobutane. Because
the allylic [C₁-Zr] bond is more reactive to electrophiles than is the
primary alkyl [C₃-Zr] bond, the addition of a carbonyl electrophile
(E¹-X = MeCOMe) selectively functionalizes the allylmetal fragment,
whereas the second electrophile (E²-X = H₃O⁺, I₂) reacts with the
remaining alkylzirconocene species to give 4f-4i as E-isomers in good
yields.

Figure 2 | Zirconocene-promoted allylic C-H and C-C bond activation of
ω-ene-cyclopropanes. C₄H₈ZrCp₂ reacts with the remote double bond of
1a to give 7 and free butene. Then, through an allylic C-H bond activation,
the η³-allyl intermediate 8 is generated and, after hydride insertion,
zirconacyclopropane 9 is formed. Then an irreversible step occurs
transforming the zirconacyclopropane 9 into allyl-alkyl zirconocene species
3a or 3b. From the two possible C-C bonds that could be activated (C₁-C₂
versus C₂-C₃ in 9), only the C₂-C₃ bond is cleaved, leading to the primary and
allyl-bismetalated organozirconocene derivative 3a, which may be represented
as its σ-allylic zirconocene 3b. Because the C₁-Zr fragment is more reactive to
electrophiles than is the C₃-Zr fragment, the addition of the first electrophile
reacts selectively at room temperature (RT = 25 °C) with the allylzirconocene
moiety. Also shown are representative examples of the scope of the combined
allylic C-H activation, selective activation and bisfunctionalization of the
resulting organozirconocene species en route to acyclic fragments. Bu, butyl;
Et, ethyl; Hex, hexane.

This multifold functionalization of a single molecule was also
successfully applied to alkylidenecyclopropane derivatives (ACPs) 2 to
furnish linear adducts 4. Alkylidenecyclopropanes are highly reactive
substrates, often giving rise to complex mixtures of products, and
development of mild and selective transformations for this class of
compounds is very useful. In this context, the Zr walk could lead
to particularly interesting transformations of ACPs (Fig. 3). To further
our mechanistic understanding, we used CD₃-labelled ACP 2a as a
substrate. The migration of a deuterium atom into the vinylic position
in the product 4j corroborated our hypothesis of allylic C-H activation
(Fig. 3), as follows.




When ACP 2a is treated with the Negishi reagent, the
zirconocene-mediated allylic C-D bond activation releases the initial strain of the
alkylidenecyclopropane system to furnish zirconacyclopropane 9d₃. It
then undergoes a cyclopropane C-C bond activation to give
zirconacyclobutane 3d₃, which is functionalized selectively at the allylic
position to give 4j in 82% yield by addition of acetaldehyde. In all cases, the
combined allylic C-H bond activation followed by the selective C₂-C₃
bond activation of alkylidenecyclopropane proceeded smoothly and
only the linear products 4k-4w were formed with no traces of activation
along the C₁-C₂ bond, even though the activation could have
otherwise occurred at the benzylic position (that is, 4l and 4n).

Because enantiomerically enriched alkylidenecyclopropanes can 
easily be prepared”, and because the chiral quaternary centre is at no 
risk of racemization in the process, the optical purity can be assumed to 
be identical throughout the rest of the synthesis. Indeed, when enan- 
tiomerically enriched 2b (R' Bu, R?=Et, R? =H; enantiomeric 
ratio, 98:2) was treated with the Negishi reagent, 4k was obtained with 
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Figure 3 | Zirconocene-promoted allylic C-H and C-C bond activation of 
alkylidenecyclopropanes. When alkylidenecyclopropane 2a is treated with 
C4HsZrCpp, an allylic C-H activation reaction occurs to give the intermediate 
9, which undergoes a selective C-C bond activation leading to the intermediate 
3 (Fig. 2). A mechanistic probe and representative examples of substrate scope 
for the reaction of alkylidenecyclopropanes with Cp,ZrC,Hg are shown. We 
note that intermediate 3 reacts, after heating at 60 °C in THF, with various 


carbony] derivatives as first electrophiles to generate a new stereogenic centre at 
a remote position with very high diastereoselectivity before addition of the 
second electrophiles. If the reaction is performed without heating before the 
addition of the first electrophile, two diastereomers are obtained in a 3:1 ratio. 
Therefore, we proposed that the heating allows the isomerization of the 
substituted allyl fragment into the E-isomer (A). Bn, PhCH); DR, diastereomeric 
ratio; ER, enantiomeric ratio. 
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Figure 4 | Merging allylic C-H activation and fragmentation en route to 
non-conjugated dienes. C,HgZrCp, reacts with the remote double bond of 
@-ene-cyclopropane methyl ether 5 to induce an allylic C-H activation 
followed by a fragmentation of the three-membered ring. The resulting 
allylzirconocene, 12, reacts selectively with AcOH to give only the 


the same enantiomeric ratio (98:2). Regardless of the E/Z ratio of the 
starting ACPs, 2, only the E-isomer was formed in the opened adducts 
(E/Z > 98:2). When an excess of D;0~ was added to the ring-opened 
product resulting from the treatment of alkylidenecyclopropane 2e 
(R' Me, R? = Et, R® CH,Ph) with the Negishi reagent, 40 was 
isolated in 90% yield with a quantitative incorporation of deuterium 
at the allylic and terminal methyl positions. Consistent with our pre- 
vious results in which the allylzirconocene [C,-Zr] is more reactive to 
electrophiles than is alkylzirconocene [C3-Zr], the addition of a first 
electrophile (E'-X = HCHO) led to selective functionalization of the 
allyl fragment that was then hydrolysed under acidic work-up to give 
alcohol 4p in 90% yield (Fig. 3). This selective reactivity of the first 
electrophile can be extended to other aldehydes and ketones without 
altering the yield of the overall transformation (that is, 4q and 4r, 
respectively). The remaining primary organozirconocene species can 
then be trapped by a second electrophile (E”X = I,) to give the bifunc- 
tionalized adduct 4s. 

This strategy holds potential for induction of 1,4-diastereoselectivity 
in acyclic systems through the transfer of chiral information from 
remote positions. To generate such remote diastereoselectivity, alky- 
lidenecyclopropane 2i (R' = Ph, R? = Pr, RP? = Me) was treated under 
the same conditions with the Negishi reagent and the reaction mixture 
was heated to 60 °C in tetrahydrofuran (THF) for a few hours before 
addition of the first electrophile (Supplementary Information). In this 
case, the open adduct 4t was obtained, after hydrolysis, with a diaster- 
eomeric ratio of 98:2 in good isolated yield. Interestingly, without 
heating the reaction mixture, a modest 3:1 ratio is obtained and may 
be attributed to conformational isomers of the substituted allylzirco- 
nocene fragment before reaction with the aldehyde; by heating to 
60°C, the most stable thermodynamically E-isomer is quantitatively 
formed (A; Fig. 3). This sequence can be extended to other versatile 
acyclic adducts while diastereomeric ratios remain excellent (forma- 
tion of 4t-4w). Finally, we applied this combined Zr-walk/C-C bond 
fragmentation strategy to @-ene-cyclopropanes that possess a stra- 
tegically placed leaving group on the cyclopropane ring (Fig. 4). 
When cyclopropanes 5a-5g, which are easily prepared by carbometa- 
lation reaction of cyclopropenyl esters”’, were treated with the Negishi 
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non-conjugated diene 6 with an excellent E/Z selectivity. After the allylic C-H 
activation, the zirconacyclopropane 11 undergoes a fragmentation reaction to 
lead to the allylzirconocene species 12. Acetolysis leads to non-conjugated 
dienes 6 possessing the all-carbon quaternary stereocentres. Ac, acetyl. 


reagent, dienes 6a—6g were obtained in good yields and with excellent 
E/Z ratios after hydrolysis. Mechanistically, the allylic C-H activation 
leads to zirconacyclopropane 11, which undergoes fragmentation to 
give the allylzirconocene species 12. Owing to the potential steric 
interactions between the zirconocene and the quaternary stereocentre,
we propose that only isomer 12 is present, in its σ-bond form, with no
evidence for a metallotropic equilibrium. Hence, 12
reacted with deuterated acetic acid at low temperature, by means of an
SE2′ reaction, to give 6a(D) with excellent positional selectivity.
This sequence was then applied to several other differentially substituted
cyclopropanes, 5a–5f, to give non-conjugated dienes in all cases.
This reaction is not restricted to the one-carbon tether; for example, 
6g, which has a three-carbon tether (n = 3), was also obtained in good
yield. Because enantiomerically enriched cyclopropenyl esters can be
easily prepared, the formation of non-conjugated dienes possessing
an all-carbon quaternary stereocentre is therefore possible through 
this combined sequence of allylic C-H activation-ring fragmentation 
(Fig. 4). 

We have shown that any cyclopropane species possessing a remote 
double bond such as w-ene-cyclopropanes 1 and 5 or alkylidenecyclo- 
propanes 2 can be easily transformed into acyclic molecular fragments 
possessing two stereogenic centres including the original challenging 
all-carbon quaternary stereocentre. This process goes through zirconocene-mediated
allylic C-H activations followed by highly selective
C-C bond activations or fragmentations. Owing to the presence of
the quaternary centre, the resulting bifunctional nucleophilic species 
are further derivatized with two different electrophiles to give more 
complex molecular architectures arising from simple starting materials. 
Overall, this work emphasizes the feasibility of synthetic approaches to 
exploit the multifold reactivity of a single substrate to furnish advanced 
molecular scaffolds through the combination of transformations that 
are otherwise difficult to perform. 


METHODS SUMMARY 


To a solution of zirconocene dichloride (496.5 mg, 1.70 mmol) in THF (45 ml) cooled
to −78 °C was added dropwise n-BuLi (1.40 M in hexane, 2.4 ml, 3.4 mmol). The
resulting solution was stirred for 1 h at −78 °C. A solution of alkylidenecyclopropane
2 (1.00 mmol) in THF (5 ml) was added, and the reaction mixture was slowly
warmed to 10 °C. After stirring for 13 h, the reaction mixture was heated to 60 °C for
6 h. After cooling the reaction mixture to 10 °C, the carbonyl derivative (4 mmol) was added
and the reaction mixture was stirred for 40 min at the same temperature.
Then the second electrophile (5 mmol) was added and the reaction mixture was
stirred for 3 h. An aqueous solution of HCl (1 M) was then added at 0 °C and, after
extraction with Et₂O, the organic extracts were washed with water, NaHCO₃ and
brine, dried (MgSO₄), filtered, and concentrated in vacuo to give a crude oil. The
product 4 was obtained pure by silica-gel chromatography using gradient mixtures 
of ethyl acetate and n-hexane as eluents. 
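
The reagent quantities quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the reported procedure: it assumes rounded standard atomic masses and the formula C10H10Cl2Zr for zirconocene dichloride, and the variable names are illustrative only.

```python
# Rounded standard atomic masses in g/mol (assumed values, not from the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "Cl": 35.45, "Zr": 91.224}


def molar_mass(formula):
    """Return the molar mass (g/mol) of an {element: count} mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# Zirconocene dichloride, Cp2ZrCl2, written out as C10H10Cl2Zr.
cp2zrcl2 = {"C": 10, "H": 10, "Cl": 2, "Zr": 1}

zr_mmol = 496.5 / molar_mass(cp2zrcl2)  # mg divided by g/mol gives mmol
buli_mmol = 1.40 * 2.4                  # mol/l times ml gives mmol
substrate_mmol = 1.00                   # alkylidenecyclopropane 2

print(f"Cp2ZrCl2: {zr_mmol:.2f} mmol ({zr_mmol / substrate_mmol:.2f} equiv.)")
print(f"n-BuLi:   {buli_mmol:.2f} mmol ({buli_mmol / zr_mmol:.2f} per Zr)")
```

Under these assumptions the zirconocene dichloride works out at about 1.70 mmol and the n-BuLi at about two equivalents per zirconium, in line with the amounts stated.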
Three-quarters of the oceanic crust formed at fast-spreading ridges
is composed of plutonic rocks whose mineral assemblages, textures 
and compositions record the history of melt transport and crystal- 
lization between the mantle and the sea floor. Despite the impor- 
tance of these rocks, sampling them in situ is extremely challenging 
owing to the overlying dykes and lavas. This means that models for 
understanding the formation of the lower crust are based largely on 
geophysical studies and ancient analogues (ophiolites) that did
not form at typical mid-ocean ridges. Here we describe cored intervals 
of primitive, modally layered gabbroic rocks from the lower plutonic 
crust formed at a fast-spreading ridge, sampled by the Integrated 
Ocean Drilling Program at the Hess Deep rift. Centimetre-scale, mo- 
dally layered rocks, some of which have a strong layering-parallel 
foliation, confirm a long-held belief that such rocks are a key constituent
of the lower oceanic crust formed at fast-spreading ridges.
Geochemical analysis of these primitive lower plutonic rocks—in 
combination with previous geochemical data for shallow-level plu- 
tonic rocks, sheeted dykes and lavas—provides the most completely 
constrained estimate of the bulk composition of fast-spreading 
oceanic crust so far. Simple crystallization models using this bulk 
crustal composition as the parental melt accurately predict the bulk 
composition of both the lavas and the plutonic rocks. However, the 
recovered plutonic rocks show early crystallization of orthopyroxene, 
which is not predicted by current models of melt extraction from the 
mantle and mid-ocean-ridge basalt differentiation. The simplest
explanation of this observation is that compositionally diverse melts 
are extracted from the mantle and partly crystallize before mixing to 
produce the more homogeneous magmas that erupt. 

The gabbroic rocks that make up the lowermost oceanic crust formed
at fast-spreading ridges, such as the East Pacific Rise (EPR), have long
been assumed to be modally layered and primitive in composition.
Igneous layering, and a layering-parallel foliation, are nearly ubiquitous
in the lower plutonic sections of many ophiolites, and explaining the
formation of these layered rocks has become central to models for the
accretion of the plutonic crust at fast-spreading ridges. Accretion models
have evolved from describing layers accumulating along the floors of large
magma bodies to describing layers developing in sill-like magma bodies
focused at the top of the crystal mush within axial magma chambers or
distributed throughout the crustal mush zone, or both. Until now,
however, no significant cored intervals of layered gabbros have been recovered
from the lower plutonic section at a modern fast-spreading ridge.

Integrated Ocean Drilling Program (IODP) Expedition 345 was con- 
ceived as a test of whether modern fast-spreading crust shows layering 
similar to many ophiolites and as a test of models for the transport of 
parental melts into the crust and the differentiation of these melts 
within the crust. To sample the generally inaccessible lower plutonic 
crust, the expedition took advantage of the tectonic window at the Hess 
Deep rift (HDR) in the equatorial Pacific Ocean (Fig. 1). This site is
unique in that it is the only place where the lower crust and the upper
crust have been extensively sampled by submersible or remotely operated
underwater vehicle and drilling (Ocean Drilling Program
(ODP) Leg 147), and previous studies of known sea-floor exposures of
lower plutonic rocks have suggested that layering exists. At IODP
Site U1415, primitive olivine gabbros and troctolites were recovered at
one 35-m-deep hole (U1415I) and two ~110-m-deep holes (U1415J
and U1415P), located within 100 m of each other (Extended Data
Fig. 1). Sampling of primitive layered gabbro and troctolite series at 
Site U1415 thus provides the final part of the most complete composite 
section of fast-spreading EPR crust so far. 

Simple modally layered and irregularly banded rocks, collectively 
called the layered gabbro series, were recovered from all three drill holes, 
comprising >50% of the core. The layered gabbro series from holes 
U1415I and U1415J show simple modal layering, with or without concurrent
grain size variations, on a scale of centimetres to decimetres
(Fig. 2a and Extended Data Fig. 2). Layers include troctolite, olivine 
gabbro, gabbro and gabbronorite with local variations in texture (for 
example the presence or absence of clinopyroxene oikocrysts). Layering 
is reminiscent of ‘dynamic’ layering resulting from magmatic flow
commonly found in layered mafic intrusions and some ophiolites. A 
layering-parallel foliation exists throughout these rocks that is commonly 
Figure 1 | Tectonic setting of the HDR and the location of IODP Site U1415.
The HDR formed by deep lithospheric extension in front of the
westwards-propagating Cocos–Nazca spreading centre, exposing oceanic crust that
formed at the fast-spreading (130 mm yr⁻¹) EPR. Upper crustal lavas and dykes
are exposed along the northern and southern escarpments, shallow-level
gabbros along the northern escarpment and the western intrarift ridge, and
lower-level gabbros along the southern slope of the intrarift ridge. a, Map of


strong (Fig. 2c) and locally anastomoses around large oikocrysts. In con- 
trast, the layered gabbro series in Hole U1415P shows irregular banding 
that is identified by modal and grain size variations, with all of the same 
lithologies present in the simple layers of holes U1415I and U1415J, but
also includes rare anorthositic bands (Fig. 2b). Grain size variation is 
much more extreme and heterogeneous than in the simple layers, bands 
can be discontinuous, and one lithology can enclose another. Additionally, 
the boundaries between bands are generally less planar than the simple 
layers and show abrupt changes in mineralogy leading to asymmetric 
distributions of distinct leucocratic and melanocratic bands (Fig. 2b), 
and mineral foliations are weak or absent. This banding is reminiscent 
of non-dynamic layering in layered mafic intrusions that is the result of 
varying rates of nucleation and growth, and postcumulus processes.
The troctolite series at the base of holes U1415J and U1415P contain 
melanocratic to leucocratic troctolite with little or no layering or band- 
ing and a weak-to-moderate foliation (Extended Data Fig. 1). Evolved 
lithologies such as FeTi oxide gabbros and felsic veins, which are prevalent
in the upper gabbros, are strikingly absent throughout the cores,
indicating that evolved residual melt was efficiently extracted from the 
lower plutonic crust. Also absent are ultramafic rocks, suggesting the 
recovered lithologies are not part of the mantle transition zone. 

The foliation and layering in the layered gabbro series provide impor- 
tant constraints on the processes of crustal accretion. There is little sub- 
solidus crystal plastic deformation, meaning that the foliations were 
formed early, while the rocks were still partly molten. In addition, olivine 
commonly exhibits skeletal morphologies, which limits the amount of 
grain-scale strain that some of the rocks may have experienced at low 
melt fractions. The abundance of layering in the material recovered 
from Site U1415, along with the absence of intermixed evolved litho- 
logies, distinguishes the HDR lower gabbroic crust from crustal sections 
recovered from slow-spreading ridges (see, for example, ref. 19). This 
supports models that invoke a strong spreading rate and thermal control 
on magma chamber processes at mid-ocean ridges. Furthermore, the
occurrence of layering that resembles both dynamic and non-dynamic
layering in layered mafic intrusions suggests multiple mechanisms of
crustal accretion and melt differentiation. This variation in style of layering
and banding, and the diversity of lithologies, differs from the
mid-ocean-ridge basalt (MORB)-like, southern portions of the Oman ophiolite,
which has been used as a fast-spreading-ridge analogue. The primary
difference is that simple modal layering dominates and irregular banding
is very rare in this region of Oman.


the Galapagos triple junction (TJ) in the eastern equatorial Pacific Ocean 
showing major tectonic boundaries. The white box indicates the location of the 
map in b. b, Regional bathymetric map of the HDR showing key morphological 
features and the locations of IODP Site U1415 and ODP Site 894. Maps show 
bathymetry derived from satellite altimetry data and archived multibeam 
bathymetry data available from the Global Multi-Resolution Topography Data 
Portal at the Lamont-Doherty Earth Observatory. 


The Site U1415 cores are much more primitive (high MgO and high 
Mg number, Mg# = 100Mg/(Mg + Fe)) than previously recovered 
samples of the overlying upper gabbros (Fig. 3). These new samples 
allow estimation of the bulk composition of fast-spreading oceanic 
crust because the thickness of the crust and its component parts can 
be estimated along with uncertainties from field relationships determined
from four previous submersible surveys and ODP Leg 147
drilling (Methods). The relative proportions of the upper crust (lavas
and dykes form 22.5% ± 4.5% of the crust) and shallow-level gabbros
(32.5% ± 7.5% of the plutonics) constrain the plausible fraction of
deeper gabbros. For these calculations, we use the mean compositions 
of the upper crust (lavas and sheeted dykes) from the northern escarp- 
ment, the shallow-level gabbros from the northern escarpment and 
ODP Site 894, and the lower gabbros from IODP Site U1415 (see 
Fig. 1b for locations, and Extended Data Table 1). 

The calculated bulk composition of the HDR crust (Extended Data
Table 1) contains 12.1 ± 1.0 wt% MgO, 7.7 ± 0.4 wt% FeO’ (all iron
as FeO) and has a Mg number of 74 ± 1, which falls at the Mg-rich end
of the range of experimentally determined parental melts for MORBs.
The calculated liquidus temperature is ~1,290–1,300 °C, and fractional
crystallization models follow the expected MORB sequence of olivine
(Fo ~ 90–91 (Fo = 100Mg/(Mg + Fe))) followed by plagioclase (An ~ 84–87
(An = 100Ca/(Ca + Na))) then clinopyroxene (Mg# ~ 86–88), with orthopyroxene
(Mg# ~ 75–82) saturation at ~1,165–1,180 °C (refs 22, 23)
with 46–35% melt remaining (Fig. 3 and Methods). The first olivine
to be produced in these models has a similar Fo content to the olivine in
dunites and harzburgites recovered from Hess Deep by previous drilling
(Fo content, 89.4–91.3; refs 24, 25); this is consistent with the bulk-crustal
composition being representative of a primary mantle melt extracted
either directly from the harzburgites or through dunite channels.
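
The Mg number quoted for the bulk crust can be checked directly from the oxide contents. The sketch below is illustrative only: it assumes that Mg# = 100Mg/(Mg + Fe) is evaluated on a molar basis with all iron as FeO, and it uses rounded molar masses that are not given in the text.

```python
# Rounded molar masses in g/mol (assumed values, not from the text).
MOLAR_MASS = {"MgO": 40.304, "FeO": 71.844}


def mg_number(mgo_wt, feo_wt):
    """Molar Mg# = 100 Mg / (Mg + Fe) from wt% MgO and wt% FeO (all Fe as FeO)."""
    mg = mgo_wt / MOLAR_MASS["MgO"]
    fe = feo_wt / MOLAR_MASS["FeO"]
    return 100.0 * mg / (mg + fe)


# Bulk-crust MgO and FeO values as listed in Extended Data Table 1.
bulk_crust_mg_number = mg_number(12.12, 7.66)
print(f"Bulk-crust Mg# = {bulk_crust_mg_number:.1f}")
```

With the tabulated bulk-crust values of 12.12 wt% MgO and 7.66 wt% FeO, this gives a Mg number close to 74, matching the figure stated above.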

An unexpected finding is that cumulus orthopyroxene commonly 
occurs in primitive (bulk-rock Mg# 80–85) plutonic rocks from the deep
portion of the crust at the HDR (Fig. 2d). Orthopyroxene is a common 
minor cumulus phase and an early intercumulus phase (<5%) in oli- 
vine gabbro, gabbro, gabbronorite and troctolite in the layered gabbro 
series. In contrast, the virtual absence of orthopyroxene as a phenocryst 
in MORB globally (including HDR), as well as experimental studies of 
MORB differentiation (see, for example, ref. 8) and modelling of the
differentiation using our bulk-crustal composition (Fig. 3), all indicate 
that orthopyroxene should not be a liquidus phase until >50% crystal- 
lization has occurred with a substantial interval of clinopyroxene 
Figure 2 | Typical gabbroic rocks at Site U1415. a, Simple, centimetre-scale
modal layers of alternating lithology (labelled to right of core) (Hole U1415I, section 
4R-1, 47-115 cm); layer boundaries (red dashed lines) are sharp and planar, and a 
layering-parallel foliation is seen throughout. Additional examples are given in 
Extended Data Fig. 2. b, Orthopyroxene-bearing olivine gabbro showing irregular, 
steeply dipping leucocratic and melanocratic bands that range from distinct to weak 
(Hole U1415P, section 8R-1, 86-144 cm). Banding is defined by modal, grain
size and textural variations. Also of note is an orthopyroxene-rich band.
c, Photomicrograph, under cross-polarized light, of a troctolite in the layered gabbro
series showing a strong magmatic foliation indicated by the red arrow (sample 
U1415], section 4R-2, 0-4 cm, piece 2). d, Photomicrograph, under cross-polarized 
light, showing cumulus orthopyroxene in an orthopyroxene-bearing olivine gabbro 
(sample U1415P, section 4G-1, 5-7 cm, piece 2). Ol G, olivine gabbro; Tr, troctolite; 
GN, gabbronorite; Ol GN, olivine-bearing gabbronorite; Opx Ol G, orthopyroxene- 
bearing olivine gabbro; Ol, olivine; Opx, orthopyroxene; Pl, plagioclase. 


crystallization preceding orthopyroxene saturation. Such late-stage 
orthopyroxene is commonly found in more evolved gabbros, including 
the shallow-level gabbros from the HDR.

Orthopyroxene is ubiquitous in the upper mantle, where its coe- 
xistence with olivine buffers the silica activity in primary mantle melts. 
Figure 3 | Variations in whole-rock CaO and Al₂O₃ with MgO for different
parts of the crust at the HDR. a, CaO; b, Al₂O₃. The lower plutonics recovered
from Site U1415 are much more primitive than the shallow plutonics. 

The compositions of the lower-gabbro cumulates and lavas and dykes can be 
broadly explained using a simple fractional crystallization model (grey arrows) 
of the bulk-crustal composition, but the shallow gabbros clearly contain 
substantial trapped melt in the bulk composition (that is, they are mixtures of 
cumulate and melt compositions). The uncertainties for the bulk-crustal and 
plutonic-section compositions are smaller than their symbol size (Extended 
Data Table 1). Fractional crystallization trends (showing instantaneous 
compositions) for the melt (thick solid light-grey arrow) and cumulates 
(thick dashed dark-grey arrow) calculated using output of the PETROLOG 
program schematically encompass the range of models considered
(Methods). The first appearance of each mineral modelled is shown for the 
cumulate crystallization trend: olivine is the liquidus phase, plagioclase appears 
after ~7% crystallization, clinopyroxene appears after ~32% to 35% and 
orthopyroxene appears after ~55% to 65%. 


The expected late crystallization of orthopyroxene in MORB is due 
to the generation of MORB by means of polybaric, near-fractional 
melting, with an average melting pressure of about 10 kbar (ref. 27).
Decompression of melts aggregated from throughout the melting 
column leads to an expansion of the olivine stability field and shrinking 
of the orthopyroxene stability field, leaving the low-pressure melt far 
from orthopyroxene saturation. Although several processes could
explain the occurrence of orthopyroxene in the deep primitive gabbros 
at the HDR, most seem unlikely. For example, the parental magmas could 
be more oxidized than typical MORB. This would lead to less of the Fe in 
the melt being divalent and, hence, available to partition into mafic 
phases, and could also lead to early oxide saturation driving an increase
in silica activity, both of which could lead to early orthopyroxene saturation.
However, this model is difficult to reconcile with either the normal
differentiation trends observed in the overlying lavas and dykes, including
Fe-enrichment trends, or the virtual absence of FeTi oxides (typically
≤0.1 modal per cent) in the Site U1415 rocks. Another possibility is
that orthopyroxene saturation is influenced by the addition and removal
of H₂O from the system. The low water content of primitive MORB
and the observation that the Site U1415 cumulates contain almost no
magmatic amphibole suggest very limited H₂O in the system, making this
possibility unlikely. The most reasonable explanation is that orthopyroxene
was precipitated from a primitive melt that had undergone little
decompression since being in (major-element) equilibrium with shallow 
mantle orthopyroxene. This can be explained if this melt either was 
generated by shallow mantle melting, or re-equilibrated with the shallow 
mantle as it was transported through it, and crystallized within the crust 
without first mixing with aggregated MORB melts in the crust. Partial
re-equilibration of melt during transport through the shallow mantle is 
supported by the relatively high Sr content of the primitive cumulates, 
which suggests that their parental melt was not depleted in incompatible 
elements. If this model is correct, it indicates that diverse melt composi- 
tions feed the crust and that the lower crust acts as an efficient filter for 
mixing these before the eruption of their homogenized and differentiated 
products. 

Overall, our findings demonstrate that although the bulk oceanic 
crust at the HDR has a similar composition to that predicted for parental 
MORB, the diversity in parental melts added into the crust is greater 
than expected. Partial crystallization of these diverse melts occurs before 
mixing, something that is not considered in models of MORB dif- 
ferentiation. However, such melts are not erupted, indicating that melt 
transport through the lower crust acts as an efficient mechanism to 
homogenize the Moho-crossing melts. The heterogeneity in the lithol- 
ogies, bulk compositions, layering types and foliation strength observed 
within the Site U1415 core suggests complex melt differentiation and 
crustal accretion processes in the lower crust at fast-spreading ridges. 


METHODS SUMMARY 


The bulk composition of the EPR crust exposed at the HDR and its uncertainty 
were calculated using new data and published compositions and relative mass 
fractions of the main crustal lithologies: lavas and dykes, shallow-level gabbros, 
and deep-level gabbros (Extended Data Table 1). The mass fractions of the crustal 
lithologies and their uncertainties are derived from field observations. A series
of models was used to investigate whether differentiation of a parental melt
with the composition of the bulk crust would produce cumulates and residual 
melts of similar composition to the observed plutonics and upper crustal rocks at 
the HDR. A fuller description is given in Methods. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS

Calculation of the bulk composition of the HDR crust. Calculation of the bulk 
composition of the crust requires knowledge of the compositions and relative mass 
fractions of the crustal lithologies. We divide the crust into three sections: (i) the 
upper crust (lavas and dykes), (ii) shallow-level gabbros and (iii) deep-level gabbros. 
Seismic velocity-depth models for undisrupted EPR crust north of the HDR 
indicate a crustal thickness of ~5.6 km (G. Christeson, personal communication).
Field relationships constrain the thickness of the upper crust to be ~1.25 ± 0.25 km
(1 s.d.; ref. 13), and the subjacent plutonic sequence is therefore 4.1–4.6 km thick.
The respective volumes of the upper crust and plutonic sequence were converted to 
mass fractions on the basis of their density differences. 
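
The thickness bookkeeping in this paragraph can be reproduced with simple arithmetic. This is a sketch under a simplifying assumption: it works with thickness fractions only and ignores the density-based conversion to mass fractions mentioned above, for which no densities are given here.

```python
crust_km = 5.6                          # total crustal thickness
upper_km, upper_sd_km = 1.25, 0.25      # upper crust (lavas and dykes), 1 s.d.

# Plutonic thickness for the thickest and thinnest upper crust within 1 s.d.
plutonic_range_km = (
    crust_km - (upper_km + upper_sd_km),
    crust_km - (upper_km - upper_sd_km),
)

# Upper crust as a fraction of total thickness, with its 1 s.d. uncertainty.
upper_fraction = upper_km / crust_km
upper_fraction_sd = upper_sd_km / crust_km

print(f"Plutonic sequence: {plutonic_range_km[0]:.1f}-{plutonic_range_km[1]:.1f} km")
print(f"Upper crust: {100 * upper_fraction:.1f}% +/- {100 * upper_fraction_sd:.1f}%")
```

The result, roughly 22% ± 4.5% by thickness, is close to the ~22.5% ± 4.5% adopted for the upper crust in the bulk-crustal modelling.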

The composition of the upper crust is based on an extensive sample suite 
collected by submersible along vertical transects through the lava and dyke sec- 
tions of the northern escarpment of the HDR between longitudes 101° 13.5’ and 
101° 28.5’ west (Fig. 1b). This representative data set includes whole-rock lava and 
dyke compositions (n = 157) and glass compositions (n = 18). A series of test
calculations was performed to assess how using either just the glasses or an
average of the glass and whole-rock data affects the resulting bulk-crustal composition;
minimal difference in bulk-crustal composition was found and the mean
of the two data sets was therefore used to define the bulk composition of the upper
crust and its uncertainty. The bulk-crustal modelling assumes the upper crust
comprises ~22.5% ± 4.5% (1 s.d.) of the total crustal thickness.

The shallow-level gabbro compositions are also based on an extensive sample 
suite, including samples collected by submersible from the northern escarpment 
where the upper 1 km of the plutonic section is well exposed subjacent to the 
sheeted dyke complex, across a horizontal distance of 3 km (ref. 13); by drilling 
a 150-m-long section from the upper 1 km of the plutonic sequence at ODP Site
894; and by submersible from the western end of the intrarift ridge in the vicinity
of ODP Site 894. The mean composition of the samples from the northern
escarpment (n = 56), ODP Site 894 (n = 76) and the western end of the
intrarift ridge (n = 7) was used to define the bulk composition of the shallow
gabbros and its uncertainty. The proportion of the shallow-level gabbros in the
plutonic section is estimated using field relationships from the HDR (see above)
and the Oman ophiolite, which show them to comprise >20%–25% and 20%–50%
of the plutonic section, respectively. The bulk-crustal modelling assumes
that the shallow-level gabbros comprise ~32.5% ± 7.5% (1 s.d.) of the plutonic
sequence.

The lower-level gabbro compositions are calculated from the Site U1415 sam- 
ples, using the compositions of the layered gabbro series (n = 28) and troctolite 
series (n = 15). The proportions of the layered gabbro and troctolite series are 
approximately equal at IODP holes U1415J and U1415P (the two >100-m-deep 
drill holes), and we thus model their relative proportions as 50% ± 20% when
calculating the bulk-crustal composition and its uncertainty.

The mean compositions of the different sections of the crust calculated as 
explained above are given in Extended Data Table 1, along with the calculated 
bulk-crustal and bulk-plutonic-sequence compositions. The uncertainties in the 
mass fractions of each portion of the crust, along with the uncertainties in their 
average compositions, were propagated into the uncertainty in the bulk-crustal 
composition using a Monte Carlo error propagation assuming all errors are 
Gaussian. 
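
As an illustration of this kind of calculation, the sketch below propagates the stated uncertainties into the bulk-crustal MgO content by Monte Carlo sampling. It is a simplified stand-in, not the calculation actually performed: it treats MgO only, uses the quoted thickness proportions directly as mixing fractions (no density conversion), draws unclipped Gaussian values, and the function and variable names are hypothetical. The input means and standard errors are those listed for MgO in Extended Data Table 1.

```python
import random
import statistics

# MgO in wt% as (mean, s.e.m.) for each crustal section.
MGO = {
    "upper": (6.83, 0.05),
    "shallow": (8.09, 0.12),
    "layered": (13.40, 0.66),
    "troctolite": (19.27, 2.09),
}
# Proportions as (mean, 1 s.d.), as quoted in the Methods text.
F_UPPER = (0.225, 0.045)    # upper crust as a fraction of the whole crust
F_SHALLOW = (0.325, 0.075)  # shallow gabbros as a fraction of the plutonics
F_LAYERED = (0.50, 0.20)    # layered series as a fraction of the lower gabbros


def bulk_mgo(conc, f_upper, f_shallow, f_layered):
    """Mix the four sections into a bulk-crust MgO value (wt%)."""
    lower = f_layered * conc["layered"] + (1 - f_layered) * conc["troctolite"]
    plutonic = f_shallow * conc["shallow"] + (1 - f_shallow) * lower
    return f_upper * conc["upper"] + (1 - f_upper) * plutonic


def monte_carlo(n_draws=20000, seed=0):
    """Return the mean and standard deviation of bulk MgO over random draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        conc = {name: rng.gauss(mean, sd) for name, (mean, sd) in MGO.items()}
        draws.append(
            bulk_mgo(conc, rng.gauss(*F_UPPER), rng.gauss(*F_SHALLOW), rng.gauss(*F_LAYERED))
        )
    return statistics.mean(draws), statistics.stdev(draws)


central = bulk_mgo(
    {name: mean for name, (mean, _) in MGO.items()},
    F_UPPER[0], F_SHALLOW[0], F_LAYERED[0],
)
mc_mean, mc_sd = monte_carlo()
print(f"Central estimate: {central:.2f} wt% MgO")
print(f"Monte Carlo: {mc_mean:.2f} +/- {mc_sd:.2f} wt% MgO")
```

Under these assumptions the central estimate is about 12.1 wt% MgO with a spread of roughly 1 wt%, comparable to the tabulated bulk-crust value and its standard deviation.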


Modelling melt differentiation. A series of models was run to investigate whether 
differentiation of a parental melt with the composition of the bulk crust would 
produce cumulates and residual melts of similar composition to the observed 
plutonics and upper crustal rocks at the HDR. Both the MELTS model and
the PETROLOG model were used to test how sensitive the results are to the
model calibration. The models all assumed perfect fractional crystallization at
1 kbar and oxygen fugacities between the quartz–magnetite–fayalite buffer and
one log unit below this buffer. Although perfect fractional crystallization is
unlikely in oceanic crust (see, for example, ref. 9 and references therein), comparison
of these trends with observed compositions from the HDR provides a first-order
test of whether the calculated bulk-crustal composition is an appropriate
parental melt composition. Although the choice of model used has a small effect
on the result, all models are broadly consistent in predicting a liquidus temperature
of ~1,300 °C, a crystallization sequence of olivine → olivine + plagioclase →
olivine + plagioclase + clinopyroxene, and that orthopyroxene is not saturated
until after a substantial interval of cotectic olivine + plagioclase + clinopyroxene
crystallization. The PETROLOG models used the model of ref. 40 to divide Fe into
FeO and Fe₂O₃ on the basis of the defined fO₂, and the following combinations
of mineral models: olivine, refs 41, 42; plagioclase, refs 27, 42; clinopyroxene, 
refs 27, 42; orthopyroxene, refs 43, 44. The model crystallization trends shown 
in Fig. 3 outline the compositional range for the melt and instantaneous cumulates 
with progressive fractional crystallization using these mineral models in various 
combinations. 
Extended Data Figure 1 | Summary lithostratigraphic columns of the
gabbroic rocks recovered at IODP holes. a, U1415I; b, U1415J; c, U1415P. 
Columns show recovery, lithological units, major rock types, dip of magmatic 
foliations and well-constrained magnetic remanence inclination values (mean 
and 1 s.d. listed). Lithological units were identified on the basis of similarities in 
rock types, magmatic textures and foliations. Palaeomagnetic remanence 
directions and the dip of the magmatic foliations and layers (not shown) for 
units II and III in holes U1415J and U1415P are most easily interpreted as 
blocks that probably formed by slumping and were rotated relative to each 
other. Ghost cores (G cores) are intervals drilled during hole cleaning 
operations. In a, Unit II refers to the Unit II layered gabbro series. d, Map showing
the relative locations of holes U1415I, U1415J and U1415P; microbathymetry 
from ref. 45. 
Extended Data Figure 2 | Core images showing examples of simple,
centimetre-scale modal layering and a moderate-to-strong magmatic
foliation. a, 345-U1415J-5R-2, piece 1, 2.0–17.5 cm; b, 345-U1415]-8R-2,
piece 9, 105.5–121.0 cm.
Extended Data Table 1 | Bulk compositions of crustal sections used to calculate the bulk-crustal composition and the bulk composition of the
HDR crust and plutonic section

Section                                         SiO2    TiO2   Al2O3    FeO’     MnO     MgO     CaO    Na2O     K2O   Total    Mg#*       n
Upper crust                         mean       50.64    2.12   13.75   11.79    0.21    6.83   10.57    2.77    0.13   99.87    50.2     175
                                    s.e.m.      0.05    0.03    0.06    0.12    0.00    0.05    0.07    0.02    0.01    0.05    0.00
Shallow gabbros                     mean       51.22    1.15   15.05    9.09    0.17    8.09   12.02    2.41    0.05   99.32    61.8     139
                                    s.e.m.      0.15    0.09    0.13    0.22    0.00    0.12    0.13    0.04    0.00    0.07    0.01
Lower gabbros†
  Site U1415 layered series         mean       47.22    0.16   19.50    4.95    0.09   13.40   13.39    1.46    0.10  100.78    82.6      28
                                    s.e.m.      0.33    0.01    0.40    0.20    0.00    0.66    0.37    0.07    0.01    0.18    0.01
  Site U1415 troctolite series      mean       44.56    0.08   18.23    5.45    0.09   19.27   11.58    0.85    0.13  100.83    85.7      15
                                    s.e.m.      0.47    0.02    1.31    0.41    0.01    2.09    0.69    0.19    0.02    0.32    0.01
Bulk crust composition                         48.32    0.83   16.71    7.66    0.14   12.12   11.95    1.83    0.11   99.73    73.8
                                    s.d.        0.51    0.10    0.50    0.37    0.01    1.04    0.31    0.13    0.01    0.80     2.2
Plutonic section bulk composition              47.65    0.46   17.57    6.47    0.12   13.66   12.35    1.56    0.10  100.30    79.0
                                    s.d.        0.63    0.08    0.60    0.36    0.01    1.28    0.39    0.15    0.01    1.03    0.02

Oxides in wt%. FeO’, all Fe as FeO. s.d., standard deviation; s.e.m., standard error on the mean; n, number of samples.
* Mg# = 100Mg/(Mg + Fe).
† Includes data for IODP holes U1415I, U1415J and U1415P.
The effects of genetic variation on gene expression
dynamics during development 


Mirko Francesconi & Ben Lehner


The development of a multicellular organism and physiological 
responses require massive coordinated changes in gene expression 
across several cell and tissue types. Polymorphic regions of the
genome that influence gene expression levels have been identified
by expression quantitative trait locus (eQTL) mapping in many
species, including loci that have cell-dependent, tissue-dependent
and age-dependent effects. However, there has been no comprehensive
characterization of how polymorphisms influence the complex
dynamic patterns of gene expression that occur during development 
and in physiology. Here we describe an efficient experimental design 
to infer gene expression dynamics from single expression profiles in 
different genotypes, and apply it to characterize the effect of local (cis) 
and distant (trans) genetic variation on gene expression at high tem- 
poral resolution throughout a 12-hour period of the development of 
Caenorhabditis elegans. Taking dynamic variation into account iden- 
tifies >50% more cis-eQTLs, including more than 900 that alter the 
dynamics of expression during this period. Local sequence poly- 
morphisms extensively affect the timing, rate, magnitude and shape 
of expression changes. Indeed, many local sequence variants both 
increase and decrease gene expression, depending on the time-point 
profiled. Expression dynamics during this 12-hour period are also 
influenced extensively in trans by distal loci. In particular, several 
trans loci influence genes with quite diverse dynamic expression pat- 
terns, but they do so primarily during a common time interval. Trans 
loci can therefore act as modifiers of expression during a particular 
period of development. This study provides the first characterization, 
to our knowledge, of the effect of local and distant genetic variation on 
the dynamics of gene expression throughout an extensive time period. 
Moreover, the approach developed here should facilitate the genetic 
dissection of other dynamic processes, including potentially develop- 
ment, physiology and disease progression in humans. 

To characterize the effects of genetic variation on gene expression 
dynamics, we developed an approach that reconstructs dynamic changes 
in gene expression from the static genome-wide expression profiles of 
several individual genotypes. For both technical and biological reasons, 
individual samples collected for gene expression profiling will normally 
not be perfectly synchronized. Rather than considering this a disadvantage, 
our method exploits this variation to infer the dynamics of 
gene expression at high temporal resolution (Fig. 1). Our approach has 
two stages: first, we rank the samples according to their relative 
developmental progression inferred from their gene expression profiles 
(Fig. 1b). Second, we use multivariate dynamic time warping to derive 
the precise physiological time point of each sample, relative to a reference 
time series (Fig. 1c). The dependence of these dynamics on genetic 
variation in a population is then evaluated (see Methods). 

As a model process, we considered the late larval and early adult 
development of the nematode C. elegans, which represent a series of 
complex, multi-tissue developmental transitions that occur over many 
hours. Genome-wide gene expression measurements have been made 
on more than 200 recombinant inbred advanced intercross lines 
(RIAILs) derived from a cross between two divergent strains of 
C. elegans: the standard Bristol (N2) laboratory strain and an isolate 
from Hawaii (CB4856), with each RIAIL genotyped at 1,455 markers 
(Fig. 1a). The genomes of the parental strains differ by a level of 
polymorphism similar to that between two humans. As a reference gene 
expression time series, we used a data set consisting of 12 genome-wide 
expression profiles made every 3 h and in triplicate from the mid-L3 larval 
stage to reproductive maturity in the Bristol strain. 

To determine the relative physiological age of each sample, we used 
a multivariate technique, canonical correlation analysis, to search for 
mutually uncorrelated linear combinations of gene expression (canonical 
variates) in the reference time series that best explain (that is, are 
maximally correlated with) gene expression trends in the RIAIL data 
set. The first two canonical variates are highly enriched for oogenesis 
and spermatogenesis genes, respectively (Extended Data Fig. 1a), and 
the trajectory of the reference samples (Extended Data Fig. 1b) reveals 
that these germline expression signatures can be used to sort the RIAIL 
samples by their physiological age (Fig. 1b). 

After sorting the RIAILs, we next inferred their absolute physiological 
ages using a multivariate dynamic time-warping algorithm 
that searches for the optimal match between the profiles of the first 
six RIAIL canonical variates and those of the reference time series 
(Fig. 1c, d and Extended Data Fig. 1f-h). This reveals that the physio- 
logical ages of the 206 RIAIL samples are non-uniformly distributed 
over a 12-h time interval centred on the late L4 developmental stage 
(Fig. 1d, inset). 
Figure 1 | Reconstructing dynamics from static expression profiles. 

a, Overview of the experimental design. b, Projection of the RIAILs on the 
first two canonical variates. c, Expression of the second canonical variate, 
representing spermatogenesis, in each RIAIL ranked according to age. 

d, Dynamic time warping onto the reference time series. Inset, distribution 
of the inferred physiological ages of the RIAILs. 
Figure 2 | High-resolution view of gene expression during 12 h of C. elegans 
development. Hierarchically clustered expression of 15,855 genes in the 206 
RIAILs ordered by their inferred physiological age. Tissue-specific expression 
signatures enriched in each cluster are listed (tissues accounting for at least 
30% of annotated genes, and Fisher’s exact test P < 0.0134, which corresponds 
to FDR = 0.1). 


The 206 ordered RIAIL samples provide an unprecedentedly high 
temporal resolution picture of gene expression dynamics during this 
period. In total, 14,926 out of 15,855 expressed genes (94%) change 
in relative expression during this 12-h time interval (false discovery 
rate (FDR) = 0.1, Supplementary Table 1), with hierarchical clustering 
revealing a range of different dynamic patterns of expression deriving 
from genes expressed in different tissues within the animal (Fig. 2). For 
example, multiple time-shifted waves of gene expression are observed 
from the hypodermal cells, reflecting the molting cycle, and even 
spermatogenesis-related expression falls into two temporal phases (Fig. 2). 
Intestine and neuronal expression can also be identified from the high- 
resolution temporal dynamics (Fig. 2). 

To analyse how genetic variation influences expression dynamics, 
we performed eQTL mapping. We first focused on local regulatory 
variation. Without considering time, regression on local genetic markers 
identified 2,958 genes with expression levels significantly affected 
by local sequence variation (cis-eQTLs, FDR = 0.1). However, adding 
time as a covariate yielded 4,246 cis-eQTLs, 44% more than when time 
is ignored. Moreover, 905 genes were found to have expression that is 
significantly better explained by a model that includes an interaction 
between local genetic variation and time than by an additive model 
(at FDR = 0.1). Indeed, for 300 genes, a significant cis-eQTL could 
only be detected when considering an interaction with time, resulting 
in a total of 4,546 cis-eQTLs (54% more than when time is ignored and 
methods that regress out hidden confounding factors have not been 
applied; Extended Data Fig. 1i). 
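
The comparison between the additive model and the model with a genotype-by-time interaction is a nested-model F-test. A minimal Python sketch of the statistic follows. It is not the authors' R code: the function name, the residual sums of squares and the parameter counts (an intercept, four spline terms and one marker term in the additive model, plus four interaction terms in the fuller model) are illustrative assumptions.

```python
def nested_f_statistic(rss_reduced, rss_full, n_params_reduced, n_params_full, n_samples):
    """F statistic for a reduced linear model nested inside a fuller one.

    rss_*      : residual sums of squares of the two fitted models
    n_params_* : number of fitted coefficients (including the intercept)
    n_samples  : number of observations (here, RIAIL samples)
    """
    df_num = n_params_full - n_params_reduced   # extra coefficients being tested
    df_den = n_samples - n_params_full          # residual degrees of freedom
    if df_num <= 0 or df_den <= 0:
        raise ValueError("the full model must add parameters and leave residual df")
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)


if __name__ == "__main__":
    # Hypothetical values for one gene measured in 206 samples.
    f_value = nested_f_statistic(rss_reduced=120.0, rss_full=100.0,
                                 n_params_reduced=6, n_params_full=10,
                                 n_samples=206)
    print(f_value)
```

The P value is then the upper tail of the F distribution with (df_num, df_den) degrees of freedom, which requires a statistics library and is left out of this sketch.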

We classified the 905 ‘dynamic’ cis-eQTLs into five non-mutually 
exclusive classes (Fig. 3 and Extended Data Fig. 1j). A total of 174 cis-eQTLs 
affected the magnitude of an expression response, with the Hawaii allele, 
for example, showing stronger induction than the Bristol allele owing to 
local sequence variation (Fig. 3A, a), or with an increased amplitude of a 
periodic expression change (Fig. 3A, b). By contrast, for 275 genes it was 
the timing of expression that was altered, with expression often showing 
faster dynamics for one of the two alleles (Fig. 3B). For 148 genes with 
monotonic dynamics, it was the rate of expression change that was altered, 
with faster or slower induction of expression depending on whether 
the allele derived from the Bristol or Hawaii parental strain (Fig. 3C). 
Finally, in 245 cases, the situation was more complex, with the Bristol 
and Hawaii alleles showing a different shape of temporal expression 
dynamics during this 12-h period (Fig. 3D). 

Notably, for 230 genes, the effect of the cis-eQTL changed sign (that 
is, from positive to negative) one or more times even during this 
limited 12-h time window, meaning that the Hawaii allele would have 
been reported as increasing or decreasing expression depending on the 
time-point considered (Fig. 3E), similar to what has been observed 
when comparing between cell types. Thus, even during a relatively 
short time window, local genetic variation between two individuals can 
have diverse and complex effects on gene expression. 

As previously reported, genes with significant cis-eQTLs have a 
higher level of local sequence polymorphism and they are enriched on 
the more polymorphic arms of C. elegans autosomes (Extended Data 
Fig. 1k). This is also true for genes with dynamic cis-eQTLs (Extended 
Data Fig. 1k). However, genes with different classes of dynamic cis-eQTLs 
differ in the density of polymorphisms in different genic regions (see d in 
Fig. 3, and Extended Data Fig. 1l). In particular, genes with a significant 
cis-eQTL altering the rate or the shape of expression have a higher level of 
polymorphism in their 5’ untranslated regions (UTRs) (Fig. 3C, D), sug- 
gesting that causal variants might be enriched in the 5’ UTRs and that 
changes in the rate of expression may be mediated by changes in post- 
transcriptional regulation. 

We next analysed the effect of distant (trans) genetic variation on 
the dynamics of gene expression using a multivariate approach, random 
forest regression, including physiological age as a covariate 
(see Methods). In total, we identified 3,164 significant trans-eQTLs 
(FDR = 0.1; Fig. 4A), indicating that the expression of a large number 
of genes is influenced by non-local polymorphisms between the Bristol 
and Hawaii genomes. 

We found that 773 of the genes with trans-eQTLs were influenced 
by variation in 10 trans hotspots (Fig. 4A, B), including three previously 
identified loci. We clustered the expression of the genes influenced 
by each of these loci and analysed their tissue-specificity (Fig. 4 
and Extended Data Fig. 2). Two hotspots target hypodermally expressed 
genes (Fig. 4C, b and c, 37% of annotated genes, P = 1.55 X 10 * and 
39%, P=1%X 10°, respectively), one is specific for sperm-expressed 
genes (Fig. 4C, d, 80% of annotated genes, P = 1.39 x 10 °) and another 
for intestine-expressed genes (Extended Data Fig. 2d). The hotspot con- 
taining a known loss-of-function polymorphism in the neuropeptide 
receptor npr-1 in the Bristol strain targets neuronal- (28% of annotated 
genes, P = 1.06 X 10° °) and body-wall-muscle-expressed genes (28% of 
annotated genes, P = 6.28 X 10”) (Extended Data Fig. 2a). Thus, one 
set of trans-eQTL hotspots probably acts by influencing the develop- 
ment of individual cell types within the animal. 

Notably, however, although a hotspot may target a single tissue, the 
target genes can have diverse temporal dynamics. For example, the hotspot 
in the middle of chromosome I alters the expression of genes expressed 
in the hypodermis with three different dynamic patterns of 
expression (Fig. 4C, b). This promiscuity is also apparent for hotspots 
that influence expression in several tissues. For example, the hotspot on 
the left arm of chromosome V alters the expression of genes with five 
different dynamic patterns of expression, including genes that increase, 
decrease or oscillate in expression during this 12-h window (Fig. 4C, a). 
Only one of the top ten hotspots primarily influences the expression of 
genes with a single dynamic pattern (Fig. 4C, d). 

What connects the expression of genes expressed in different tissues 
and with different time trends that are influenced by a common trans 
hotspot? In some cases, at least, it seems that the hotspot region influences 
expression levels during a restricted time window. For example, the hotspot 
on chromosome V primarily affects expression at the start of the 
12-h time interval, altering the expression of genes with quite diverse 
expression trends during this period (Fig. 4C, a). By contrast, the hotspot 
on chromosome I has the largest effects on expression slightly later in 
development (Fig. 4C, b), and the hotspot in the middle of chromosome IV 
affects expression a few hours later in development (Fig. 4C, c). Thus, 
sequence variation in these regions alters the expression of multiple genes, 
including those with different expression trends, but during a specific 
period of development (Fig. 4). 

Figure 3 | A total of 905 cis-eQTLs affect the dynamics of gene expression. 
A-F, Example cis-eQTLs that affect the magnitude (A), timing (B), 
rate (C), and shape (D) of expression changes, or 
that have opposite effects depending on the time 
point (E) and strictly additive effects (F). Bristol 
alleles are red, Hawaii alleles are blue, box plots 
(left) include all time points. A-F, d, Density of 
polymorphisms in different regions of the genes in 
each class. inter, intergenic region; SNP, single 
nucleotide polymorphism; syn, synonymous sites. 
P values calculated using Fisher’s exact test. Error 
bars represent 95% binomial confidence intervals. 

Figure 4 | trans-eQTLs. A, Gene expression (y axis) significantly influenced 
by marker genotypes (x axis). B, The number of genes influenced in trans 
by each marker identifies trans-eQTL hotspots, including three previously 
identified regions (asterisks). C, Left, clustered heat maps of the time-ordered 
expression of genes regulated by trans-eQTL hotspots; right, expression of the 
means of each cluster (labelled i-v, as in the heat maps) when the hotspot 
carries the Bristol (red) or Hawaii (blue) allele. Three hotspots affecting 
expression during a limited time interval (pale blue background) are shown, 
together with a hotspot specific for sperm expression. See also Extended Data 
Fig. 2. 

In summary, we have performed a comprehensive analysis of how 
local and distant genetic variation influences the dynamics of gene ex- 
pression. Considering the dynamics of gene expression identified more 
eQTLs, but, more importantly, revealed that local sequence variation can 
have diverse and complex effects, altering the timing, rate, magnitude 
and shape of expression changes. Natural sequence variation also exten- 
sively alters gene expression dynamics in trans, with hotspot regions 
altering expression in individual tissues or across multiple tissues, often 
affecting genes expressed with quite diverse dynamic trends, and, in 
some cases, primarily influencing expression during a particular tem- 
poral period of development. 

The approach that we have used here represents an efficient experi- 
mental design, exploiting technical, biological or deliberate variation in 
timing to infer dynamics from a single static profile of expression in each 
genotype. We envisage, therefore, that it could be quite widely applied 
for the genetic analysis of other complex dynamic systems, including the 
potential to characterize the effect of natural variation on development, 
physiology and disease progression in humans and other economically 
important species. 


METHODS SUMMARY 


The physiological age of each RIAIL sample was estimated using a canonical cor- 
relation analysis of the RIAIL expression profiles and a reference gene expression 
time course followed by multivariate dynamic time warping on the pairs of canonical 
variates. Single nucleotide polymorphisms and insertions and deletions between the 
CB4856 and N2 genomes were obtained from the million mutation project. 
cis-eQTLs were detected using a linear model that includes RIAIL age as a covariate, 
while trans-eQTLs were detected using random forest regression including age as a 
covariate. Tissue-specific gene expression was defined using published data. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS 

Recombinant inbred lines. The RIAILs derive from a cross between the N2 (Bristol) 
and CB4856 (Hawaii) strains of C. elegans, followed by ten generations of random pair 
mating and ten generations of selfing. Each RIAIL was previously genotyped at 
1,455 SNP markers. 

Gene expression profiles. The gene expression data set analysed here consists of 
microarray expression profiles for 208 RIAILs. Worms were synchronized by 
hypochlorite treatment and reared at 20 °C. Animals were collected as soon as the 
population started to lay embryos. RNA was extracted using the TRIzol method and 
hybridized to Agilent 4×44k microarrays using standard protocols as described 
previously. Gene expression data, batch and dye information were obtained from 
the Gene Expression Omnibus (GEO) database. Microarray probe annotations 
were obtained from Wormbase release WS220. 

Data pre-processing. Microarray probes matching multiple loci or pseudogenes 
according to WS220 were discarded. In addition, probes were discarded if they 
contained SNPs or insertions and deletions (indels) in CB4856 according to the 
sequencing data from the million mutation project, as were probes with more 
than 10 saturated values (value higher than 64,000) across the 208 RIAILs, or those 
not detected as expressed in at least 8 samples. The threshold for classifying a gene 
as expressed was defined using model-based cluster analysis (function Mclust in the R 
mclust library) on single channel log intensities. The lowest intensity cluster had a 
mean of ~3.65 and a standard deviation (s.d.) of ~0.03, so the threshold for ex- 
pression was defined at 3.6 + 5 s.d. = 3.8 log intensity. No background subtraction 
was applied and intensities within arrays were normalized using LOESS (using the 
function normalizeWithinArrays in the limma library in R). Next, the reference 
intensities were normalized among arrays using quantile normalization, ensuring 
that the reference channel has the same empirical distribution across all the arrays 
(function normalizeBetweenArrays, limma library, R methods Gquantile and 
Rquantile). Log intensities of different probes for the same genes were averaged. 
Principal component analysis (PCA) on the uncentred data matrix highlighted two 
outlier samples that were excluded from all further analyses (Extended Data Fig. 1c). 
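
The quantile normalization step described above forces every array to share the same empirical distribution. The following is a minimal Python sketch of that idea, not the limma implementation the authors used; the function name is hypothetical, ties are broken by position rather than averaged, and all arrays are assumed to have the same number of probes.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length arrays (one list per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank, so all arrays end up with identical sorted values.
    """
    n_probes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the k-th smallest intensity across all arrays.
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns)
                  for k in range(n_probes)]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Two hypothetical arrays with three probes each.
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization the within-array ordering of probes is unchanged; only the values attached to each rank are shared between arrays.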

Developmental time course. The developmental time course gene expression 
data set consists of gene expression profiles of pools of synchronized worms reared 
at 25 °C collected every 3 h for a total of 12 time points in three replicates, 
encompassing development from the middle of the L3 larval stage to young adults. Data 
were obtained from the SPELL database in log2 fold change format, each of the 
12 data points being the average across the three replicates. 

SNPs and indels. SNPs and indels between the CB4856 and N2 genomes were 
obtained from the million mutation project. In total, there are 173,898 SNPs and 
5,515 indels between the two strains. 

Ranking samples by developmental progression: canonical correlation analysis. 
Canonical correlation analysis was used to compare the gene expression 
profiles of the RIAILs to the expression profiles of the reference development time 
series. This analysis searches for uncorrelated trends in gene expression (called 
canonical variates) in one data set that best explain (that is, are maximally corre- 
lated with) gene expression trends in the other one. The cc function in the CCA 
library in R was used. RIAILs were sorted according to the angle formed by their 
loadings onto the first and second canonical variates transformed into a polar 
coordinate system. These projections correspond to the oogenesis and spermato- 
genesis gene expression trends, respectively. PCA performed on the RIAIL sam- 
ples further supported this conclusion that germline development is a major 
source of expression variance (Extended Data Fig. 1d, e). 
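
The ranking step described above, sorting samples by their angle in the plane of the first two canonical variates, can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and the choice of centre, zero angle and direction of rotation are assumptions that the text does not specify.

```python
import math


def rank_by_polar_angle(cv1, cv2, center=(0.0, 0.0)):
    """Return sample indices ordered by polar angle in the (cv1, cv2) plane.

    cv1, cv2 : per-sample scores on the first and second canonical variates
    center   : point around which the angle is measured (an assumption)
    """
    cx, cy = center
    # atan2 gives the angle in radians in the interval (-pi, pi].
    angles = [math.atan2(y - cy, x - cx) for x, y in zip(cv1, cv2)]
    return sorted(range(len(angles)), key=lambda i: angles[i])


if __name__ == "__main__":
    # Four hypothetical samples placed on a unit circle at known angles.
    true_angles = [2.0, -1.0, 0.5, -2.0]
    cv1 = [math.cos(a) for a in true_angles]
    cv2 = [math.sin(a) for a in true_angles]
    print(rank_by_polar_angle(cv1, cv2))
```

Because the samples trace an arc rather than a straight line in this plane, an angle orders them along the trajectory where a single coordinate would not.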

Estimating the developmental stage of each sample: multivariate dynamic 
time warping. After sorting the RIAILs according to their relative age, the absolute 
physiological age of each sample was estimated by applying multivariate dynamic 
time warping to the canonical variates extracted from the canonical correlation 
analysis. First, a smoothing spline was fitted to both the sorted RIAILs and to the 
reference time series projections on each of the first six canonical variates using the 
gam function in the gam R library and a smoothing parameter of 0.9 for the RIAIL 
canonical variates and 0.25 for the reference ones. The difference is due to the 
fact that the RIAIL data are more variable and need more smoothing to avoid 
over-fitting, whereas the reference data points are already an average of three 
replicates. Next, the reference time series was interpolated with cubic splines at 700 
points using the spline function in R. Finally, multivariate dynamic time warping was 
applied using the projection of the RIAIL expression splines onto the first six gene 
expression trends extracted by the canonical correlation analysis and using the 
developmental time series as reference. The dtw function from the dtw library in R 
was used with the open start, open end and asymmetric step options. 
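
To illustrate the alignment idea, here is a minimal multivariate dynamic time warping sketch in Python. It is deliberately simpler than the configuration described above: it uses the classic symmetric step pattern with fixed start and end points, not the open start, open end and asymmetric options of the R dtw function, and the function name and toy data are hypothetical.

```python
import math


def multivariate_dtw(query, reference):
    """Align two sequences of equal-dimension vectors with basic DTW.

    Returns (total_cost, path), where path is a list of (query_index,
    reference_index) pairs from the first to the last matched points.
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(query[i - 1], reference[j - 1])  # Euclidean distance
            cost[i][j] = local + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Trace the optimal path back from the end of both sequences.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    # One-dimensional toy example; real inputs would have six canonical variates.
    query = [(0.0,), (1.0,), (2.0,)]
    reference = [(0.0,), (0.0,), (1.0,), (2.0,)]
    print(multivariate_dtw(query, reference))
```

In the setting described above, the reference index matched to each RIAIL point would be converted to a time using the known sampling times of the interpolated reference series.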

cis-eQTLs. A focused single-marker approach was applied to detect cis-eQTLs. 
Standard linear regression was used to test the significance of the effect of the 
closest marker on gene expression. The significance of a model including only the 
marker effect was tested against a null model using an F-test. P values from the 
F-test were adjusted for multiple testing, controlling the FDR using the 
Benjamini-Hochberg method. 
This yielded 2,958 genes significant at FDR = 0.1. Next, a time term was included 
in the model as a covariate to model time-dependent gene expression changes. In 
this term, nonlinear expression dynamics were modelled using natural cubic 
splines basis functions (function ns of the splines library in R) using 4 degrees 
of freedom for all genes, chosen by evaluating the quality of the fit on the first 6 
canonical variates of the canonical correlation analysis. An F-test was used to test 
whether a model that included both the time term and the marker term was 
significantly better than a model that only includes the time term. This test yielded 
4,246 significant eQTLs at FDR = 0.1. To define eQTLs including a significant 
interaction between genotype and time, the F-test was used to test whether a linear 
model that includes the age by marker interaction is significantly better than a 
model that only includes time and marker additive effects. This test yielded 905 
eQTLs with significant time by marker interaction at FDR = 0.1. These results are 
robust to variation in time estimation (Extended Data Table 1a, b). eQTLs were 
classified as strictly additive if the raw P value for time by marker interaction was 
higher than 0.5 (n = 1,247). The 905 eQTLs with a significant time by marker 
interaction were further divided into five non-mutually exclusive classes by inspection 
of the time trends. cis-eQTLs that affected the magnitude of an expression response 
or with an increased amplitude of a periodic expression change were 
included in the ‘magnitude’ class (n = 174). eQTLs that affect the timing of expression, 
for example, where the gene expression of the two alleles peaks at different 
times, were assigned to the ‘timing’ class (n = 275). For genes with a monotonic expression 
dynamic, if the eQTLs show a different rate of expression change (that is, a faster or 
slower induction) they were classified as ‘rate’ (n = 148). eQTLs where Bristol and 
Hawaii alleles show very different temporal expression dynamics that could not be 
obtained by a trivial transformation were assigned to the ‘shape’ class (n = 245). 
We also considered cases where eQTLs are characterized by a change of sign of the 
effect during the time window; that is, when one allele shows higher expression at 
one time point but weaker induction at another time point. These eQTLs were 
labelled as ‘inversions’ (n = 230). Forty eQTLs where one allele has just one outlier 
point at the extreme of the time window were considered as false positives and 
were not further considered, together with 43 more eQTLs where the difference 
between the two alleles is very small and difficult to interpret and so also probably 
represent false positives. Classifications of all the cis-eQTLs are given in 
Supplementary Table 2. 
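
The Benjamini-Hochberg adjustment applied to the F-test P values above can be sketched in a few lines of Python. This is a generic illustration of the procedure rather than the authors' R code; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the tests sorted from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]          # hypothetical per-gene P values
    adjusted = benjamini_hochberg(raw)
    significant = [q <= 0.1 for q in adjusted]   # FDR threshold used in the text
    print(adjusted, significant)
```

A gene is called significant at FDR = 0.1 when its adjusted value is at most 0.1.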

trans-eQTLs. To identify trans-eQTLs, a genome-wide multivariate analysis was 
performed using random forest regression. Random forest is a powerful multi- 
variate machine learning technique based on ensembles of regression trees. This 
algorithm can model nonlinear trends with age, can include a large number of 
correlated predictors, is robust to over-fitting, and provides a variable importance 
measure that implicitly takes into account interactions among predictors. The 
random forest implementation provided in the randomForest library in R was 
used. After filtering out completely correlated markers that can degrade the per- 
formance of the random forest algorithm, 884 useful markers remain; that is, those 
that differ in at least one RIAIL. These markers, estimated age, batch and dye were 
included as predictors in the random forest models. The number of predictors to 
try at each split of each tree (the ‘mtry’ parameter) was set to the 
maximum possible; that is, equal to the total number of predictors. This allows 
age and batch to always be included in the trees and it was empirically found to 
increase the performance compared to the default value of the mtry parameter. A 
forest of 1,000 trees was fitted on each gene and the mean decrease in prediction 
accuracy across the trees normalized by the standard deviation was used as a 
measure of variable importance. The random forest analysis was also run on 10 
permuted data sets, each one obtained by permuting the sample labels to retain the 
correlation structure in gene expression. The null distribution of variable import- 
ance of each predictor from these randomizations was used to calculate empirical 
P values for importance. The matrix of empirical P values was then corrected for 
multiple testing controlling the FDR (function p.adjust in R, method = ‘FDR’). 
This yields 5,803 eQTLs at 10% FDR. Extremely close eQTLs identified by random 
forests are likely to be linked to the same underlying causal variants. For this reason 
and to be conservative, hierarchical clustering of the genotype correlation matrix 
was used to group together highly correlated markers into 150 genomic intervals. 
For each gene, eQTLs mapping to the same interval were merged, keeping the 
eQTL with the highest variable importance score. This filtering leaves 4,976 
eQTLs. The same 150 genomic intervals were used to define local and distant 
eQTLs. An eQTL that lies in the same interval as the affected gene plus or minus 
500 kilobases was classified as cis. Using this definition leaves 3,164 trans-eQTLs. 
Trans hotspots were defined using a Poisson test. According to this test, markers 
associated with nine or more eQTLs have a significantly higher number of eQTLs 
than expected by chance at FDR = 0.1. This yields 84 markers that are associated 
with 1,270 genes. Adjacent markers were merged into the same hotspot if they have a 
Pearson correlation coefficient of their genotypes across the RIAILs greater than 
0.95. This leaves 61 trans hotspots. Two small hotspots map to chromosome I close 
to the zeel-1/peel-1 locus responsible for the genetic incompatibility between 
Bristol and Hawaii. The two hotspots in this locus are supported by an extremely 
low number of Hawaii alleles and were not included in subsequent analyses. Genes 
in the trans hotspots were clustered using hierarchical clustering to highlight the 
changes in gene expression dynamics associated with each hotspot. Close hotspots 
that show similar changes in expression dynamics are likely to share the same 
causal variants or combinations of causal variants and they were further merged 
into one. This is the case for two close hotspots on the left arm of chromosome I, 
for two close hotspots on the left arm of chromosome IV, two on the right arm of 
chromosome IV, three on the left arm of chromosome V, and for three hotspots on 
the left arm of chromosome X close to npr-1. This leaves a total of 51 hotspots. 
The genes targeted by the top 10 trans-eQTL hotspots are listed in Supplementary 
Table 3. We also performed a single marker analysis (Extended Data Fig. 3) taking 
into account time by using a robust linear model fitting approach (function lmrob 
from the robustbase package in R). The full model that includes time, marker and the 
interaction between marker and time was compared to a model that only includes 
time, using a likelihood ratio test (R function anova.lmrob with the test = ‘Deviance’ 
option). The analysis was repeated on ten permuted data sets and the number of 
genes showing at least one significant linkage in the permuted data sets was used to 
determine the FDR. 
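
The empirical P values for random forest variable importance described above compare each observed importance with a null distribution obtained from permuted data. A minimal Python sketch of that comparison follows. The function name is hypothetical, and two details are assumptions not stated in the text: the null values for a predictor are pooled into one list, and one is added to numerator and denominator so that the estimate is never exactly zero.

```python
def empirical_p_value(observed, null_values):
    """Fraction of null importances at least as large as the observed one.

    observed    : variable importance from the real data
    null_values : importances for the same predictor from permuted data
    """
    n_exceeding = sum(1 for value in null_values if value >= observed)
    # Add-one correction (an assumption) keeps the P value above zero.
    return (n_exceeding + 1) / (len(null_values) + 1)


if __name__ == "__main__":
    # Hypothetical importances from four permuted data sets.
    null_importances = [0.5, 1.0, 3.5, 2.0]
    print(empirical_p_value(3.0, null_importances))
```

The resulting matrix of P values (genes by predictors) would then be corrected for multiple testing as described in the text.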

Tissue specificity of gene expression. Genes highly expressed in individual 
tissues were defined using cell sorting or polyA-binding protein (PAB-1) 
immunoprecipitation, as reported previously. Embryonic cells expressing tissue-specific 
fluorescent reporter constructs were sorted by FACS to define embryo-specific 
gene expression from the germline precursors, all neurons, BAG neurons, AVA 
neurons, AVE neurons, A-class motor neurons, dopaminergic and GABAergic neurons, 
body wall muscle, coelomocytes, hypodermis, intestine and pharyngeal muscle. 
A messenger-RNA-tagging strategy was used to isolate RNA from specific larval and 
adult cells: an epitope-tagged PAB-1 is expressed under control of cell-specific 
promoters and PAB-1-RNA complexes are immunoprecipitated. Larval-specific gene 
expression profiles were obtained from all neurons, A-class motor neurons, dopaminergic and 
glutamatergic neurons, PVD and OLL neurons, body wall muscle, coelomocytes, 
hypodermis, intestine, pharyngeal muscle, excretory cells and sheath cells. In all cases, 
extracted RNA was amplified and hybridized to Affymetrix tiling arrays. Genes were 
defined to be tissue enriched if they show a twofold enrichment versus a reference 
sample at FDR < 0.05 (ref. 20). Tissue specificity of the trans bands and expression 
clusters was determined using Fisher’s exact test, considering the genes that are 
annotated in at least one tissue. Among the 3,687 tissue-specific genes, 236 have a 
significant dynamic eQTL, which is moderately enriched compared to other genes 
(odds ratio = 1.17, P = 0.04). For each of these genes, we calculated the correlation 
coefficient between the spline fitted on the expression data for each of the two alleles 
and the centroids of the expression clusters in Fig. 2. We found that in 60% of cases 
both of the alleles belong to (that is, are maximally correlated with) the same cluster, 
and in another 15% of cases the two alleles belong to two distinct clusters but with the 
same tissue specificity. This suggests that, for at least 75% of these genes, the eQTLs 
cause changes in gene expression dynamics without changes in gene tissue specificity. 
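
The enrichment tests in this section rest on Fisher's exact test applied to a 2 × 2 table. The sketch below, in Python, computes the sample odds ratio and a one-sided P value from the hypergeometric distribution. It is an illustration only: the function names and the example counts are hypothetical, and the text does not say whether the authors used a one-sided or two-sided test.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test P value for enrichment.

    Table layout: a = in set and in category, b = in set, not in category,
    c = not in set, in category, d = in neither. Returns P(X >= a) when the
    margins are fixed and X follows the hypergeometric distribution.
    """
    row1 = a + b            # size of the gene set
    col1 = a + c            # size of the category
    total = a + b + c + d
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(total - row1, col1 - k)
                    for k in range(a, upper + 1))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(odds_ratio(3, 1, 1, 3), fisher_exact_greater(3, 1, 1, 3))
```

With counts in the thousands, as in the analysis above, exact integer arithmetic still works in Python but a statistics library would normally be used instead.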

Statistical analyses. All statistical analyses were performed in R. 
Extended Data Figure 1 | Inferring dynamic expression from individual 
expression profiles in each genotype and analysing the effect of local 
regulatory variation on the dynamics. a, Genes related to oogenesis (red) and 
spermatogenesis (blue) have high scores on the first and second canonical 
variates, respectively. b, Canonical correlation analysis correctly sorts the 
samples in the reference expression time series. Reference data points are sorted 
by time along a trajectory that mirrors the one formed by RIAILs (Fig. 1) when 
projected onto the first two canonical variates. Numbers indicate the time in 
hours after the mid-L3 stage at which each reference sample was prepared. 

c, PCA on the uncentred RIAIL expression data reveals two outliers that are 
excluded from further analysis. d, PCA of RIAIL gene expression 
profiles. Projection of the RIAILs onto the first two components forms a nonlinear trajectory. 
The first and second components (comp) explain 37% and 20% of the variance 
in the RIAIL expression data, respectively. e, Genes related to oogenesis (red) 
and spermatogenesis (blue) have high scores on the first and second principal 
components, respectively. f, Projections of the 206 RIAILs ranked by canonical 
correlation onto the first six canonical variates (i-vi). g, Projections of the 
reference time series onto the same canonical variates. Dynamic time warping 
of the RIAIL projections onto the reference projections is used to estimate the 
physiological age of each RIAIL. h, Projections of the 206 RIAILs onto the first 
six canonical variates (y axis) versus the final estimated age of each RIAIL 
(x axis). i, The number of eQTLs detected by a model that only includes local 
marker without considering time (no time), by a model that includes time and 
local marker additive effects (additive), and the number of ‘dynamic’ eQTLs; 
that is, the ones that are best explained by a model that also includes the 
interaction between local marker and time. j, Classification of the ‘dynamic’ 
eQTLs into non-mutually exclusive classes. k, Genes with eQTLs are in general 
biased towards the arms and tips of the chromosomes; the same is also true for 
genes with ‘dynamic’ eQTLs. Error bars represent 95% binomial confidence 
intervals. l, Density of polymorphisms in different regions of all genes, genes 
with no detected eQTLs, genes with any kind of eQTLs, and genes with 
‘dynamic’ eQTLs. Intergenic regions are defined as regions between adjacent 
coding transcripts. 5’ UTR is defined as the region between the transcript start 
and the translation start site; 3’ UTR is the region between the stop codon and 
the transcript end. SNP density is defined as the number of SNPs divided by the 
length of the region of interest. Thus, non-synonymous and synonymous SNP 
densities are both defined as number of non-synonymous and synonymous 
SNPs divided by exon length. P values for enrichments were calculated using 
Fisher’s exact test. Error bars represent 95% binomial confidence intervals. 
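
The dynamic time warping step mentioned in panel g can be illustrated with a small sketch. It is a simplified, one-dimensional Python version under stated assumptions: the analysis described here is multivariate (several canonical variates), and the function names, the absolute-difference cost and the averaging of matched reference times are illustrative choices, not the authors' implementation.

```python
def dtw_path(query, reference):
    """Classic dynamic time warping between two 1-D sequences.

    Returns (total cost, list of (query index, reference index) pairs).
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the alignment.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

def estimate_ages(query, reference, reference_times):
    """Assign each query sample the mean time of the reference samples it aligns to."""
    _, path = dtw_path(query, reference)
    matched = {}
    for qi, ri in path:
        matched.setdefault(qi, []).append(reference_times[ri])
    return [sum(matched[qi]) / len(matched[qi]) for qi in range(len(query))]

# Hypothetical projections and reference sampling times (hours), for illustration only.
reference = [0.0, 1.0, 2.0, 3.0]
reference_times = [0, 4, 8, 12]
samples = [0.0, 0.1, 1.1, 2.0, 2.9]
print(estimate_ages(samples, reference, reference_times))
```

The samples must already be ordered along the trajectory (here, by their rank in the canonical correlation analysis) for the alignment to be meaningful.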
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dynamics. Left, clustered heat maps of the time-ordered expression of genes 
regulated by different trans-eQTL hotspots in RIAILs carrying the Bristol (red) 
or Hawaii (blue) alleles of the trans hotspot. Right, temporal expression profiles 
… genomic region carries the Bristol (red) or Hawaii (blue) allele. The numbering of 
each hotspot follows that in Fig. 4B. Three trans-eQTL hotspots (numbers 5, 6 and 
7) have been previously identified. 
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Extended Data Figure 3 | Single marker analysis of trans-eQTLs. a, The 
number of target genes per marker is shown for the random forest (black) and 
single marker analysis (red). Trans hotspots were defined using a Poisson test. 
According to this test, markers associated with 13 or more eQTLs have a 
significantly higher number of eQTLs than expected by chance at FDR = 0.1 in 
the single marker analysis, and so were defined as trans hotspots. Of the top 10 
hotspots identified by random forest and presented in the manuscript, 9 are also 
identified by the single marker analysis. The missing hotspot is number 3 in 
the middle of chromosome IV; however, 21 out of 55 genes (38%) regulated by 
this hotspot have at least one eQTL in the single marker analysis, and 11 of these 
are in the same region of chromosome IV as the hotspot, consistent with the 
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random forest analysis. b, To investigate further why the random forest detects 
more genes as regulated by this hotspot, we calculated the genome-wide 
correlation between the empirical P values (from the random forest analysis) of 
the hotspot 3 marker and all other markers. The hotspot 2 marker (red dashed 
line) has the highest correlation aside from markers that are in linkage 
disequilibrium with the hotspot 3 marker. c, Marker log odds ratio (LOD) 
scores of genes in trans hotspot 3 are significantly higher when the genotype of 
the hotspot 2 marker is included in a linear model. This may contribute to why 
more genes are detected as regulated by hotspot 3 in the random forest—in 
which the influence of several markers is taken into account—than in the single 
marker analysis. 
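
The Poisson test used to define trans hotspots in panel a can be sketched in Python as below. Several details are assumptions, because the caption does not state them: the Poisson rate is taken as the mean number of eQTLs per marker, the FDR is controlled with the Benjamini-Hochberg procedure, and the counts are invented.

```python
from math import exp, factorial

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(exp(-lam) * lam ** i / factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def hotspots(counts, fdr=0.1):
    """Indices of markers with more eQTLs than expected by chance."""
    lam = sum(counts) / len(counts)  # assumed null rate: mean eQTLs per marker
    pvals = [poisson_sf(c, lam) for c in counts]
    qvals = benjamini_hochberg(pvals)
    return [i for i, q in enumerate(qvals) if q <= fdr]

# Hypothetical eQTL counts per marker: 50 ordinary markers and one outlier.
counts = [2] * 50 + [20]
print(hotspots(counts))
```

Under this scheme the smallest count that passes the FDR cut-off plays the role of the threshold of 13 eQTLs quoted in the caption.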
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Extended Data Table 1 | Robustness of the results to variation in time estimation 


a
Canonical variates (CV)    mean     min     max
1 and 2                     8.98    2.31    15.72
1 to 3                      9.96    2.69    16.14
1 to 4                     10.46    4.48    16.1
1 to 5                     10.49    4.39    16.66
1 to 6                     10.54    4.58    16.71
1 to 7                     10.51    4.58    16.62
1 to 8                     10.51    4.58    16.62
1 to 9                     10.55    4.58    16.43
1 to 10                    10.57    4.58    16.43
1 to 11                    10.61    4.72    16.43
1 to 12                    10.61    4.72    16.43

b
Age estimation       Dynamic expression    Additive eQTL    Dynamic eQTL
Relative age only    14832 (99.4%)         4099 (96.6%)     710 (78.5%)
1 and 2 CV           14867 (99.6%)         4146 (97.6%)     793 (87.7%)
1 to 5 CV            14892 (99.8%)         4181 (98.2%)     832 (92%)
1 to 7 CV            14894 (99.8%)         4177 (98.1%)     839 (92.8%)
1 to 12 CV           14885 (99.7%)         4147 (97.4%)     824 (91.5%)

a, Age estimation stability. Age estimation was performed using multivariate dynamic time warping and including an increasing number of canonical variates (CV) up to the maximum number of 12. The minimum, 
maximum and mean estimated age (in hours after mid-L3 stage) are shown. We considered the estimation to have stabilized after adding 6 canonical variates. b, Robustness of the cis-eQTL analysis to variation in 
age estimation. Shown are the number (and percentage) of genes that significantly change with age, with a significant additive cis-eQTL and significant dynamic cis-eQTL that are recovered from the original 
analysis (that uses 6 canonical variates to estimate age) when varying the age estimation. In the worst case, which occurs using only the relative age, we still recover almost 80% of the genes with dynamic eQTLs. 
Using one less or one more canonical variate in age estimation recovers well above 90% of the genes with dynamic eQTLs in the original analysis. 
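
The recovery figures in part b are counts and percentages of the original gene set that are found again under an alternative age estimate. A minimal Python sketch of that bookkeeping follows; the gene identifiers are hypothetical and the function name is illustrative.

```python
def recovery(original_hits, alternative_hits):
    """Count and percentage of the original gene set recovered by an alternative analysis."""
    recovered = set(original_hits) & set(alternative_hits)
    return len(recovered), 100.0 * len(recovered) / len(original_hits)

# Hypothetical gene identifiers, for illustration only.
original = {"gene-1", "gene-2", "gene-3", "gene-4", "gene-5"}
relative_age_only = {"gene-1", "gene-2", "gene-3", "gene-4", "gene-9"}

count, percent = recovery(original, relative_age_only)
print(f"{count} ({percent:.1f}%)")
```

Genes detected only in the alternative analysis (here "gene-9") do not enter the percentage, which is why it measures recovery rather than overall agreement.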
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HMGA2 functions as a competing endogenous RNA to 
promote lung cancer progression 


Madhu S. Kumar, Elena Armenteros-Monterroso, Philip East, Probir Chakravorty, Nik Matthews, Monte M. Winslow 
& Julian Downward 


Non-small-cell lung cancer (NSCLC) is the most prevalent histolo- 
gical cancer subtype worldwide’. As the majority of patients pre- 
sent with invasive, metastatic disease’, it is vital to understand the 
basis for lung cancer progression. Hmga2 is highly expressed in 
metastatic lung adenocarcinoma, in which it contributes to cancer 
progression and metastasis* *. Here we show that Hmga2 promotes 
lung cancer progression in mouse and human cells by operating as 
a competing endogenous RNA (ceRNA)”"’ for the let-7 microRNA 
(miRNA) family. Hmga2 can promote the transformation of lung 
cancer cells independent of protein-coding function but dependent 
upon the presence of let-7 sites; this occurs without changes in the 
levels of let-7 isoforms, suggesting that Hmga2 affects let-7 activity 
by altering miRNA targeting. These effects are also observed in vivo, 
where Hmga2 ceRNA activity drives lung cancer growth, invasion 
and dissemination. Integrated analysis of miRNA target prediction 
algorithms and metastatic lung cancer gene expression data reveals 
the TGF-β co-receptor Tgfbr3 (ref. 12) as a putative target of Hmga2 
ceRNA function. Tgfbr3 expression is regulated by the Hmga2 ceRNA 
through differential recruitment to Argonaute 2 (Ago2), and TGF-β 
signalling driven by Tgfbr3 is important for Hmga2 to promote 
lung cancer progression. Finally, analysis of NSCLC-patient gene- 
expression data reveals that HMGA2 and TGFBR3 are coordinately 
regulated in NSCLC-patient material, a vital corollary to ceRNA 
function. Taken together, these results suggest that Hmga2 pro- 
motes lung carcinogenesis both as a protein-coding gene and as a 
non-coding RNA; such dual-function regulation of gene-expression 
networks reflects a novel means by which oncogenes promote dis- 
ease progression. 

The ceRNA hypothesis posits that specific RNAs can function as 
sinks for pools of active miRNAs, functionally liberating other tran- 
scripts targeted by that set of miRNAs’*. Downregulation of the tran- 
scription factor Nkx2.1 promotes lung adenocarcinoma progression 
partially through derepression of Hmga2 (ref. 6), a non-histone chro- 
mosomal high-mobility group protein. Intriguingly, Hmga2 has been 
described as a prototypic let-7 target transcript, with seven conserved 
sites in its 3’ untranslated region (3' UTR)’*. Reduction of Hmga2 by 
RNA interference, which would deplete both Hmga2 protein and 
transcript, greatly reduces metastatic ability. Thus, it is possible that 
the transcript could operate independently of the protein in lung can- 
cer progression. 

To determine whether Hmga2 can operate as a ceRNA for the let-7 
family, we generated an allelic series of Hmga2 expression constructs 
(Fig. 1a). In this series, we expressed the wild-type full-length Hmga2 
complementary DNA (WT); Hmga2 with mutation of all seven pre- 
dicted let-7 binding sites'’ (m7); Hmga2 with mutation of the single in- 
frame start codon (ATG WT); or Hmga2 with mutation of both the 
start codon and the let-7 binding sites (ATG m7). We then examined 
these constructs in two lung cancer cell lines generated from the 


KrasLSL-G12D; Trp53flox/flox mouse model: a cell line derived from a 
non-metastatic lung tumour that expresses very low levels of Hmga2 
(368T1); and a cell line derived from a lymph node metastasis which 
expresses high levels of Hmga2 (482N1)°. Using two antibodies that 
recognize either the amino-terminus or the second AT-hook of the 
protein (M. Narita, personal communication), we found that the Hmga2 
WT and m7 constructs efficiently express full-length Hmga2 protein 
(m7 overexpresses Hmga2 owing to loss of let-7-mediated suppression), 
whereas the Hmga2 ATG WT and ATG m7 constructs do not 
(Fig. 1b). Importantly, we observed similar levels of Hmga2 transcript 
expressed in the allelic series (in the case of the 482N1 cell line, the 
allelic series was mutated to abrogate binding to a short hairpin RNA 
(shRNA) against Hmga2) (Fig. 1c). Moreover, expression of the allelic 
series has no effect on the expression of various let-7 family members 
(Extended Data Fig. 1a). Taken together, this Hmga2 allelic series allows 
us to compare specifically the roles of Hmga2 protein and transcript 
function on lung cell transformation. 

We therefore compared the ability of the Hmga2 allelic series to 
promote anchorage-independent growth of the lung cancer cell lines. 
We observed a striking promotion of soft-agar growth by both Hmga2 
WT and ATG WT in the 368T1 and 482N1 cells (Figs. 1d, e); more 
modest growth was observed with Hmga2 m7, despite elevated protein 
expression relative to Hmga2 WT, and no growth was provided by 
Hmga2 ATG m7. This effect can be observed further in two additional 
human lung cancer cell lines (H1299 and SK-MES-1), as suppression of 
soft-agar growth by HMGA2 depletion can be rescued robustly by 
Hmga2 WT and ATG WT but more modestly by Hmga2 m7 (Extended 
Data Fig. 1b-e). Importantly, exogenous expression of let-7 reversed 
the ability of the Hmga2 ceRNA to promote anchorage-independent 
growth, suggesting that let-7 regulates this effect (Extended Data 
Fig. 2b). To demonstrate that the effect of the Hmga2 ceRNA is driven 
by let-7 sites in the 3’ UTR, we expressed only the wild-type or let-7- 
site-mutated 3’ UTRs in 368T1 cells and examined the consequences 
on anchorage-independent growth. Notably, expression of the wild- 
type but not let-7-mutant 3’ UTR was sufficient to promote soft-agar 
growth in 368T1 cells (Extended Data Fig. 2c). 

Beyond direct Hmga2 depletion, the Hmga2 WT and Hmga2 ATG 
WT constructs substantially rescued anchorage-independent growth 
in 482N1 cells stably overexpressing Nkx2.1, which we have previously 
shown to suppress lung cancer progression® (Extended Data Fig. 2d). 
Notably, this effect is not due to a general proliferative benefit of Hmga2 
WT and ATG WT cells, as BrdU (bromodeoxyuridine; 5-bromo-2'- 
deoxyuridine) incorporation in adherent conditions was comparable 
across the allelic series (Extended Data Fig. 3a). However, when lung 
cancer cells are placed in suspension, Hmga2 depletion suppressed 
proliferation and this proliferation was rescued substantially by Hmga2 
WT and ATG WT, and more marginally by Hmga2 m7 (Extended 
Data Fig. 3b); in contrast, the rate of apoptosis in the allelic series was 
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Figure 1 | Hmga2 promotes lung cancer cell transformation in a protein- 
coding-independent but let-7-site-dependent manner. a, Diagram of Hmga2 
allelic series: expression constructs containing the entire Hmga2 cDNA (‘WT’); 
the cDNA with all seven let-7 sites in the 3’ UTR mutated (‘m7’); the cDNA 
with the start codon mutated (‘ATG WT’); and the cDNA with both the start 
codon and let-7 sites mutated (‘ATG m7’). ORF, open reading frame. Xs 
represent mutations of the let-7 sites in the Hmga2 3’ UTR. b, Hmga2 WT and 
m7 induce Hmga2 expression in non-metastatic lung cancer cells (368T1) and 
restore expression in metastatic lung cancer cells (482N1) depleted for 
endogenous Hmga2 (shHmga2). Two distinct HMGA2 antibodies are used: 
one recognizes the N-terminus of HMGA2 (HMGA2-CST) and the other 
recognizes the central AT-hook region of HMGA2 (HMGA2-Narita). 

c, Hmga2 RNA is comparably expressed by the WT, m7, ATG WT, and ATG 
m7 constructs in both 368T1 and 482N1 cells. Hmga2 expression is normalized 
to Gapdh. 368T1 values are normalized to empty and 482N1 values are 
normalized to shluc empty (which expresses an shRNA targeting luciferase and the 
empty expression vector). Values are technical triplicates, have been performed 
independently three times, and represent mean ± standard deviation (s.d.) 
with propagated error. d, Hmga2 WT and ATG WT promote substantial 
anchorage-independent growth in both 368T1 and 482N1 cells. Values are 
technical triplicates, have been performed independently three times, and 
represent mean ± s.d. e, Representative images of soft agar colonies. 
Magnification is ×10. ***P < 0.0005; **P < 0.005; *P < 0.05. 
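
The phrase 'mean ± s.d. with propagated error' in panel c can be made concrete with a short sketch. It assumes, hypothetically, that expression values are relative quantities (not raw Ct values), that errors are independent, and that first-order propagation for a ratio is used; the caption does not specify the exact procedure, and the numbers are invented.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(values):
    """Mean and sample standard deviation of technical replicates."""
    return mean(values), stdev(values)

def ratio_with_error(numerator, denominator):
    """Ratio of two (mean, s.d.) pairs with first-order propagated s.d."""
    (a, sa), (b, sb) = numerator, denominator
    r = a / b
    return r, abs(r) * sqrt((sa / a) ** 2 + (sb / b) ** 2)

def normalized_expression(target, reference, target_ctrl, reference_ctrl):
    """Target/reference ratio in a sample, relative to the same ratio in the control."""
    sample = ratio_with_error(summarize(target), summarize(reference))
    control = ratio_with_error(summarize(target_ctrl), summarize(reference_ctrl))
    return ratio_with_error(sample, control)

# Hypothetical technical triplicates (arbitrary units), for illustration only.
fold, sd = normalized_expression(
    target=[18.0, 20.0, 22.0], reference=[9.0, 10.0, 11.0],
    target_ctrl=[9.0, 10.0, 11.0], reference_ctrl=[9.0, 10.0, 11.0],
)
print(f"{fold:.2f} ± {sd:.2f}")
```

Here the first ratio corresponds to normalization to Gapdh and the second to normalization to the empty-vector control.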


not affected by growth in suspension (data not shown). Taken together, 
these results suggest that the Hmga2 transcript functions in a largely 
protein-coding-independent but let-7-site-dependent manner to pro- 
mote lung cancer cell transformation in vitro. 

To examine the effect of Hmga2 ceRNA activity on lung cancer cell 
dissemination in vivo, we intravenously transplanted 482N1 cells 
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expressing either the control shRNA and expression construct or the 
shRNA targeting Hmga2 plus the Hmga2 allelic series into syngeneic 
mice. As seen in Fig. 2a, micro-computed tomography (micro-CT) 
analysis revealed a substantial rescue of lung tumour formation with 
expression of either the Hmga2 WT or ATG WT constructs (with more 
modest effects with the Hmga2 m7 construct). Histopathological ana- 
lysis and quantification of surface lesions confirmed the effects of 
Hmga2 ceRNA activity on in vivo lung tumorigenesis (Figs. 2b, c). 
Moreover, the control shRNA, the shHmga2 plus Hmga2 WT and 
the shHmga2 plus Hmga2 ATG WT transplants generate a highly 
metastatic disease, with lesions disseminating to both local and distant 
lymph nodes, kidney, and the abdominal and thoracic cavities (data 
not shown). We also examined the effect of Hmga2 ceRNA function on 
survival in the transplant system. We observed a dramatic reduction in 
survival in animals transplanted with the Hmga2 WT and ATG WT 
cells, similar to that seen with transplant of the control shRNA cells 
(Fig. 2d); we noted a more modest reduction in survival with transplant 
of the Hmga2 m7 cells. In total, these findings indicate that Hmga2 
competing RNA activity dramatically promotes in vivo lung cancer 
formation. 

To elucidate the mechanism of Hmga2 ceRNA function on lung 
carcinogenesis, we analysed the set of genes differentially expressed 
between metastatic and non-metastatic Kras*'””;Trp53 ‘~ lung can- 
cer cells®° and compared them to the list of predicted let-7 target genes 
based upon the miRNA target prediction algorithm TargetScan™ (Sup- 
plementary Table 5). Kras was not a candidate in this analysis, in spite 
of previous description of Kras as an important let-7 target’®. Moreover, 
the Hmga2 allelic series had no impact on either expression of K-Ras 
protein or activity of downstream K-Ras signalling pathways (Extended 
Data Fig. 3c). In contrast, we observed several known Hmga2 tran- 
scriptional targets, including components of the Igf2bp family"®, vali- 
dating this approach. 

To elucidate more broadly which transcripts are Hmga2 ceRNA 
targets, we initially examined whether let-7 sites are enriched among 
transcripts induced by the Hmga2 ceRNA through RNA-seq of the 
482N1 allelic series combined with Sylamer analysis, which detects 
miRNA seed sites as nucleotide strings enriched within the 3’ UTRs 
of transcripts’”. We first compared control to Hmga2-knockdown cells 
and observed a specific enrichment of let-7 sites lost with Hmga2 
depletion (Extended Data Fig. 4a). We then determined whether this 
was specific to ceRNA activity by determining let-7-site enrichment 
upon re-expression of either the Hmga2 WT or ATG WT constructs in 
Hmga2-knockdown cells. In both conditions, let-7 sites were enriched 
among the upregulated transcripts (Extended Data Fig. 4b, c). Importantly, 
let-7 sites were not enriched with re-expression of either Hmga2 
m7 or ATG m7 in the Hmga2-depleted background (Extended Data 
Fig. 4d, e). Moreover, analysis of fragments per kilobase of exon per 
million fragments mapped (FPKM) in the RNA sequencing (RNA- 
seq) results from control 482N1 cells showed Hmga2 was among the 
most highly expressed predicted let-7 target transcripts, suggesting 
that Hmga2 constitutes a physiologically germane fraction of the let-7 
target milieu (Supplementary Table 6). Taken together, these results 
indicate that the Hmga2 ceRNA broadly regulates let-7 targets. 
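
The idea behind the seed-site enrichment analysis can be sketched with a single hypergeometric test. This is a much-simplified stand-in, not Sylamer itself: it tests one word at one cut-off of a ranked list, whereas the published tool scans many words and cut-offs. The site string `CTACCTC` (the DNA-alphabet match to positions 2 to 8 of let-7) and the toy UTR sequences are assumptions made for illustration.

```python
from math import comb

LET7_SITE = "CTACCTC"  # assumed 7-mer match to the let-7 seed, in DNA alphabet

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K carry the site."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def seed_enrichment(ranked_utrs, top_n, site=LET7_SITE):
    """Number of site-containing UTRs among the top_n ranked transcripts, with its P value.

    ranked_utrs: 3' UTR sequences ordered from most to least upregulated.
    """
    has_site = [site in utr for utr in ranked_utrs]
    N, K = len(has_site), sum(has_site)
    k = sum(has_site[:top_n])
    return k, hypergeom_sf(k, N, K, top_n)

# Hypothetical ranked UTRs: the three most upregulated carry the site.
with_site = "AAGT" + LET7_SITE + "TTGA"
without_site = "AAGTCCGGATTTGA"
ranked = [with_site] * 3 + [without_site] * 7

print(seed_enrichment(ranked, top_n=3))
```

A small P value at the top of the ranking corresponds to the situation reported for re-expression of Hmga2 WT or ATG WT, and its absence to the m7 and ATG m7 constructs.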

To assess Hmga2 ceRNA targets more specifically, we examined 
which transcripts were suppressed in response to Hmga2 depletion; 
13 out of 34 predicted targets were suppressed by Hmga2 knockdown 
(Extended Data Fig. 5a). To delineate which of these were Hmga2 
transcriptional targets versus ceRNA targets, we re-expressed either 
Hmga2 WT or ATG WT in knockdown cells. As seen in Extended 
Data Fig. 5b, 6 out of 13 transcripts were rescued by both Hmga2 WT 
and ATG WT, suggesting they are putative ceRNA targets; the remain- 
ing targets were rescued only by Hmga2 WT, suggesting they are 
targets of Hmga2 transcription factor function. These Hmga2 ceRNA 
targets were markedly enriched in let-7-regulated transcripts, as their 
repression by Hmga2 loss could be reversed with the use of a ‘tough 
decoy’ let-7 sponge transcript, designed to be an efficient and long-term 
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Figure 2 | Hmga2 ceRNA activity enhances lung cancer progression in vivo. 
a, Hmga2 WT and ATG WT restore lung tumour growth in response to 
endogenous Hmga2 knockdown. B6129SF1/Tac males were intravenously 
injected with 482N1 cells expressing either a control shRNA and empty vector 
(shluc empty) or shHmga2 with the Hmga2 allelic series. Three weeks 
afterwards, animals were scanned by micro-CT and representative transverse 
images are shown. The heart is demarcated (‘H’) and white arrows identify lung 
tumours. b, Representative histological images of lungs transplanted with 
482N1 cells from the series described in a. Magnification is ×1. c, Lung surface 
tumour counts were taken from animals transplanted with 482N1 cells from 
the series described in a (n = 3 animals per group). Values are technical 


suppressor of miRNA function (Extended Data Fig. 5c)'*. Conversely, 
overexpression of let-7 suppressed these transcripts, although Hmga2 
transcriptional targets were also affected owing to depletion of Hmga2 
(Extended Data Fig. 5d). Taken together, these studies outline a collec- 
tion of putative target transcripts regulated by Hmga2 ceRNA function. 

Among these six Hmga2 ceRNA target transcripts, we found the 
TGF-β co-receptor Tgfbr3 (ref. 12) to be both upregulated in metastatic 
lung cancer cells and a putative let-7 target. Furthermore, several 
Hmga2 ceRNA targets have been described as targets of TGF-β 
signalling. Thus, we examined whether Hmga2 exerts ceRNA function 
through enhanced TGF-β signalling via Tgfbr3. Consistent with this, 
we found that in both 368T1 and 482N1 cells, Hmga2 WT and ATG 
WT promote the expression of Tgfbr3 protein (Fig. 3a). This Tgfbr3 
upregulation also occurs to a lesser degree at the messenger RNA level, 
as has been described previously for miRNA targets” (Fig. 3b). Moreover, 
exogenous expression of let-7 reversed the ability of the Hmga2 ceRNA 
to upregulate Tgfbr3, suggesting that this effect is controlled by let-7 
(Extended Data Fig. 2a). An important consideration in ceRNA-target 
analysis is the absolute levels of Hmga2, Tgfbr3 and let-7 transcripts in 
cells, so we determined the copies per cell of these factors (Extended 
Data Fig. 4f). We observed that Hmga2 and Tgfbr3 had similar levels of 
transcript, as might be expected for two factors that can titrate expression 
of one another; similar results were observed in FPKM analysis of 
these transcripts in control 482N1 cells by RNA-seq (Supplementary 
Table 6). Furthermore, total let-7 family expression was within an 
order of magnitude of Hmga2 and Tgfbr3. As this pool of let-7 regulates 
the entire target set, it is possible for miRNA occupancy to be a 
limiting factor, allowing for the contribution of a ceRNA-like Hmga2. 
Taken together, these results suggest that Hmga2 may regulate Tgfbr3 
expression as a let-7 ceRNA. 

In line with these observations of Hmga2 promoting Tgfbr3 expression, 
Hmga2 WT and ATG WT activated TGF-β signalling through 


214 | NATURE | VOL 505 | 9 JANUARY 2014 
©2014 Macmillan Publishers 


[Figure 2, panels c and d. c, y axis: surface tumours. d, y axis: per cent survival; x axis: time (days). Legend: shluc empty ****; shHmga2 empty; shHmga2 WT ****; shHmga2 m7 ***; shHmga2 ATG WT ****; shHmga2 ATG m7 NS.]

triplicates and represent mean ± s.e.m. d, Hmga2 WT and ATG WT 
substantially reduce survival of animals transplanted with 482N1 cells 
expressing the shRNA targeting Hmga2. Animals were intravenously 
transplanted with cells as in a. Animals were subsequently aged for survival and 
a Kaplan-Meier analysis was performed (n = 9 animals per group). Median 
survival was 34 days for shluc empty and shHmga2 WT transplants; 37 days 
for shHmga2 ATG WT transplants; 43 days for shHmga2 m7 transplants; 
and 50 days for shHmga2 empty and ATG m7 transplants. Statistical 
significance was assessed by log-rank tests compared to shHmga2 empty. 
****P < 0.00005; ***P < 0.0005; **P < 0.005; *P < 0.05; NS, not significant. 
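
The Kaplan-Meier analysis and median survival reported in panel d can be illustrated with a minimal Python sketch. The survival times below are invented and do not reproduce the study's data; the function names are illustrative, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each animal.
    events: True if death was observed at that time, False if censored.
    Returns a list of (time, survival probability) at each observed death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical group of nine animals, all followed until death.
times = [30, 32, 34, 36, 40, 41, 45, 50, 52]
events = [True] * 9
print(median_survival(kaplan_meier(times, events)))
```

With n = 9 and no censoring, the median is simply the time of the fifth death, which is how a value such as '34 days' should be read.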


phosphorylation of Smad2 (Fig. 3a). This effect was let-7-dependent, as 
exogenous let-7 suppressed Smad2 phosphorylation (Extended Data 
Fig. 2a). It is likely that the TGF-β pathway is active in the absence of 
exogenous ligand owing to low but detectable levels of TGF-β in serum 
during cell culture. We further examined whether Hmga2 ceRNA 
function affects TGF-β pathway activation by two methods. First, we 
found that a TGF-β reporter (CAGA12) was potently induced by 
Hmga2 WT and ATG WT (Extended Data Fig. 5e). Second, analysis 
of TGF-β target transcript levels revealed specific expression of these 
genes with the Hmga2 WT and ATG WT constructs (Extended Data 
Fig. 5f). Notably, we observed little activation of the TGF-β pathway by 
Hmga2 m7, despite previous reports of Hmga2 functioning as a co- 
activator for Smad2, Smad3 and Smad4 in the epithelial-mesenchymal 
transition (EMT)”; this is likely to be due to the lack of upstream 
activation of the pathway. Consistent with this, the Hmga2 ceRNA 
does not induce an EMT in either 368T1 or 482N1 cells (Extended 
Data Fig. 6a). Overall, these results indicate that the Hmga2 ceRNA 
induces expression of Tgfbr3 and potentiates TGF-β signalling. 

To determine whether the effects of Hmga2 on Tgfbr3 occur through 
let-7-mediated derepression, we first examined the effect of Hmga2 
ceRNA function on a reporter containing the Tgfbr3 3’ UTR. In both 
368T1 and 482N1 cells, Hmga2 WT and ATG WT induced expression 
of luciferase under the control of the wild-type Tgfbr3 3’ UTR, but not 
if the let-7 site was mutated (Fig. 3c). Furthermore, we found that 
the effect of the Hmga2 ceRNA was broadly miRNA-dependent, as 
Hmga2 WT and ATG WT induced the reporter expression in Dicer1-intact 
sarcoma cells, but not in a Dicer1-null derivative cell line 
(Extended Data Fig. 6b). To assess directly whether Hmga2 induces 
Tgfbr3 through competition away from Ago2, we performed RNA 
immunoprecipitation (RIP) on Ago2 in lung cancer cells expressing 
the Hmga2 allelic series. As shown in Fig. 3d, we found that Hmga2 
WT and ATG WT were recruited to Ago2 at levels comparable to the 
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Figure 3 | Hmga2 ceRNA activity enhances TGF-β signalling through 
overexpression of Tgfbr3. a, Hmga2 WT and ATG WT substantially induce 
both Tgfbr3 protein expression and phosphorylation of Smad2 (pSmad2) in 
both 368T1 and 482N1 cells. b, Hmga2 WT and ATG WT significantly 
promote expression of Tgfbr3 mRNA in both 368T1 and 482N1 cells. Tgfbr3 
expression is normalized to Gapdh. 368T1 values are normalized to empty and 
482N1 values are normalized to shluc empty. Values are technical triplicates, 
have been performed independently three times, and represent mean ± s.d. 
with propagated error. c, Hmga2 WT and ATG WT specifically induce 
expression of a luciferase Tgfbr3 3’ UTR reporter in a let-7-site-dependent 
manner in both 368T1 and 482N1 cells. Cells were transfected with Renilla 
constructs of the control siCXCR4 multimer” and either the Tgfbr3 wild-type 
or let-7-mutant 3’ UTR reporter. Values are normalized to co-transfected pGL3 
plasmid. 368T1 values are normalized to empty and 482N1 values are 
normalized to shluc empty. Values are technical triplicates, have been 
performed independently three times, and represent mean ± s.d. with 
propagated error. d, Hmga2 WT and ATG WT displace Tgfbr3 from Ago2- 
based RNA-induced silencing complexes. Lysates from 368T1 and 482N1 cells 
of the Hmga2 allelic series underwent either control immunoprecipitation 
(IgG) or immunoprecipitation for Ago2. RNA was purified and qRT-PCR was 


recruitment of endogenous Hmga2, whereas Hmga2 m7 and ATG m7 
were not. Moreover, Hmga2 WT and ATG WT cells had a substantial 
decrease in Tgfbr3 recruitment to Ago2. Of note, these effects on Ago2 
occupancy by Tgfbr3 are not caused by a change in let-7 activity, as 
various let-7 family members were comparably loaded on Ago2 across 
the Hmga2 allelic series (Extended Data Fig. 7). Thus, these results 
demonstrate that Hmga2, through its let-7-binding sites, displaces Tgfbr3 
from miRNA-mediated repression by Ago2. In total, these results suggest 


performed for Hmga2 and Tgfbr3 on both the immunoprecipitated and input 
RNAs. Values are depicted as the percentage of input RNA, are technical 
triplicates, have been performed independently twice, and represent mean ± 
s.d. e, Multiple shRNAs elicit substantial knockdown of Tgfbr3 mRNA in both 
368T1 and 482N1 cells. 482N1 cells were infected with control shRNA (shluc) 
or a set of shRNAs targeting Tgfbr3 (shTgfbr3.1-3.5), and 368T1 WT and ATG 
WT cells were infected with shluc or shTgfbr3.1, 3.2, 3.4 and 3.5. RNA was 
purified and qRT-PCR was performed. Tgfbr3 expression is normalized to 
Gapdh and 368T1 WT and ATG WT and 482N1 values are normalized to 
shluc. Values are technical triplicates, have been performed independently 
three times, and represent mean ± s.d. with propagated error. f, Multiple 
shRNAs induce knockdown of Tgfbr3 and suppress TGF-β pathway activity in 
368T1 and 482N1 cells. Cells were infected with shRNAs as in e and Western 
blot analysis was performed for Tgfbr3, pSmad2 and total Smad2 (Smad2). 

g, Tgfbr3 depletion reduces anchorage-independent growth of 368T1 WT and 
ATG WT and 482N1 cells. Cells were infected with the listed shRNAs and 
plated for anchorage-independent growth and colonies were counted as above. 
Values are technical triplicates, have been performed independently three 
times, and represent mean ± s.d. ***P < 0.0005; **P < 0.005; *P < 0.05. 


that the Hmga2 ceRNA directly functions by blocking recruitment of 
Tgfbr3 to the Ago2-based miRNA repression complex. 

To examine whether Hmga2 ceRNA activity through Tgfbr3 is 
functionally relevant, we used shRNAs to deplete Tgfbr3 in 482N1 
cells and 368T1 cells expressing either Hmga2 WT or ATG WT. At 
both the mRNA and protein level, multiple shRNAs reduced Tgfbr3 
expression (Figs. 3e, f). Moreover, knockdown of Tgfbr3 led to substantial 
suppression of TGF-β signalling, as evidenced by loss of Smad2 
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phosphorylation, CAGA12 reporter activity and expression of TGF-β 
target genes (Fig. 3f and Extended Data Fig. 8a, b). We then assessed the 
functional effect of Tgfbr3 loss on Hmga2 ceRNA-driven soft-agar 
colony formation. In all cells, Tgfbr3 knockdown notably suppressed 
anchorage-independent growth, but not to the same extent as Hmga2 
depletion in 482N1 cells (Fig. 3g). This occurred without generally 
affecting proliferation, as measured by BrdU incorporation (Extended 
Data Fig. 8c). We further functionally analysed the broader set of six 
Hmga2 ceRNA targets by individually depleting them by short inter- 
fering RNA (siRNA) (Extended Data Fig. 8d). When we compared 
their effects on anchorage-independent growth, both Hmga2 and 
Tgfbr3 loss strongly suppressed growth, Hmga1 depletion modestly 
reduced colony formation, whereas the remaining targets had little 
effect (Extended Data Fig. 8e). It should be noted that these other 
targets include extracellular factors like Angptl2 and Col1a2 that might 
promote lung cancer progression in vivo in a non-cell-autonomous 
manner, and could thus still be relevant to Hmga2 ceRNA activity in 
lung cancer progression. This considered, these results still indicate 
that Tgfbr3, although certainly not the only relevant Hmga2 ceRNA 
target, is an important effector of Hmga2 ceRNA function in lung 
cancer cell transformation. 

To determine whether this effect of Tgfbr3 is driven through potentiation 
of TGF-β signalling, we inhibited the TGF-β pathway with 
the TGF-β-receptor-kinase inhibitor SB-431542 (SB). In 368T1 and 
482N1 cells, SB treatment led to a substantial inhibition of Smad2 
phosphorylation (Extended Data Fig. 9a). In addition, SB treatment 
of 482N1 and 368T1 Hmga2 WT and ATG WT cells markedly suppressed 
CAGA12 reporter activity and expression of TGF-β target 
genes (Extended Data Fig. 9b, c). We then examined whether SB could 
inhibit Hmga2 ceRNA-driven soft-agar colony formation and observed 
a striking reduction in anchorage-independent growth (Extended Data 
Fig. 9d). This impaired colony formation was not due to general pro- 
liferative arrest, as SB treated cells had a similar rate of BrdU incorp- 
oration (Extended Data Fig. 9e). Notably, many Hmga2 ceRNA targets 
are in fact TGF-B target genes, as their expression is suppressed by SB 
treatment and induced by exogenous addition of TGF-B (Extended 
Data Fig. 9f, g). Thus, it is possible that Hmga2 could function ina feed- 
forward loop in which it regulates TGF-f target genes directly through 
ceRNA function and indirectly through TGF-B signalling via Tgfbr3. 
In summary, these results indicate that TGF-f signalling through Tgfbr3 
is an important pathway downstream of Hmga2 ceRNA function. 

Based on the above findings, we wanted to examine whether HMGA2 
functions as a ceRNA for TGFBR3 (the human orthologues of Hmga2 
and Tgfbr3, respectively) in NSCLC patients. An important corollary 
of the ceRNA hypothesis is the coordinate regulation of a competing 
RNA and its targets, such that upregulation of the ceRNA should lead 
to higher expression of the target RNA and vice versa!”. To assess this, 
we used NSCLC gene-expression data generated by the Cancer Genome 
Atlas (TCGA) and sorted the patient cohort into the top and bottom 
quartiles of HMGA2 expression (HMGA2 high and low, respectively) 
(Fig. 4a). As seen in Fig. 4b, we observed significantly higher levels of 
TGFBR3 transcript in HMGA2 high versus low patient samples. To 
address the converse relationship, we sorted the TCGA data set into 
top and bottom quartiles of TGFBR3 expression (TGFBR3 high and 
low, respectively) (Fig. 4c). When we compared HMGA2 transcript levels 
between the groups, we found HMGA2 to be significantly overexpressed 
in TGFBR3 high versus low patient samples (Fig. 4d). To extend and 
validate these findings, we carried out similar gene-expression analyses 
of HMGA2 and TGFBR3 in an independent lung-adenocarcinoma- 
patient gene-expression cohort, the Director's Challenge data set”®. 
Similar to the findings with the TCGA cohort, we observed HMGA2 
and TGFBR3 to be coordinately expressed in the Director’s Challenge 
data set (Extended Data Fig. 10a—d). Although we focused specifically 
on high and low expressors of HMGA2 and TGFBR3, for which 
ceRNA activity is more likely to occur’®, HMGA2 and TGFBR3 
expression was broadly correlated across both data sets (Extended Data 
Fig. 10e, f). As these two data sets constitute two of the largest 
collections of NSCLC gene-expression data available, we believe that these 
findings are consistent with HMGA2 functioning as a ceRNA for 
TGFBR3 in NSCLC patients. It is possible that the coexpression of 
HMGA2 and TGFBR3 could correspond to additional tumour 
characteristics that these data sets do not include; future studies in 
independent data sets would be needed to assess this issue. In total, our 
results suggest a model in which Hmga2 promotes lung cancer 
progression by competing for let-7 occupancy with other targets, including 
Tgfbr3, leading to the upregulation of those targets (Fig. 4e). Importantly, 
this occurs without changes in the levels of let-7 family microRNAs, 
reflecting specific competition for microRNA binding among targets. 

Figure 4 | HMGA2 and TGFBR3 are reciprocally and coordinately 
upregulated in NSCLC patients. a, The Cancer Genome Atlas (TCGA) 
NSCLC gene-expression data set was sorted according to HMGA2 expression. 
The top and bottom quartiles (HMGA2 high and low, respectively) were 
selected (45 patients per group) and HMGA2 expression was compared using 
box-and-whisker plots. b, The TCGA data set was sorted into top and bottom 
quartiles of HMGA2 expression as in a, and TGFBR3 expression was compared 
using box-and-whisker plots. c, The TCGA data set was sorted into top and 
bottom quartiles of TGFBR3 expression (TGFBR3 high and low, respectively) 
as in a, and TGFBR3 expression was compared using box-and-whisker plots. 
d, The TCGA data set was sorted into top and bottom quartiles of TGFBR3 
expression as in c, and HMGA2 expression was compared using 
box-and-whisker plots. In all box-and-whisker plots, values are presented on a 
log₂ scale. Significance was assessed by the Mann-Whitney test with a 
Bonferroni correction for multiple hypothesis testing. ***P < 0.0005. 
e, Model for Hmga2 acting as a competing endogenous RNA for Tgfbr3. In 
non-metastatic NSCLC, Hmga2 expression is low, leading to suppressed 
Tgfbr3 expression by let-7. In metastatic NSCLC, Hmga2 expression is 
elevated, titrating away let-7 from Tgfbr3 and allowing for its 
overexpression. This titration occurs without changes in let-7 expression, 
reflecting competition for microRNA occupancy by target transcripts. 
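
The titration idea behind this model can be illustrated with a toy equilibrium calculation. The sketch below is not the authors' analysis: it assumes a single reversible binding step, one shared dissociation constant for all let-7 sites, a fixed total amount of let-7, and arbitrary hypothetical abundances. Under those assumptions, raising the number of competitor (Hmga2) sites lowers the fraction of Tgfbr3 sites occupied by let-7 without any change in total let-7.

```python
import math


def free_mirna(total_mirna, total_sites, kd):
    """Free microRNA at equilibrium when every site has the same Kd.

    Solves the mass balance  total = F + S * F / (Kd + F)  for F,
    i.e. the positive root of  F^2 + (Kd + S - total) * F - Kd * total = 0.
    """
    b = kd + total_sites - total_mirna
    return (-b + math.sqrt(b * b + 4.0 * kd * total_mirna)) / 2.0


def target_occupancy(total_mirna, competitor_sites, target_sites, kd=1.0):
    """Fraction of target sites bound by the microRNA (a proxy for repression)."""
    free = free_mirna(total_mirna, competitor_sites + target_sites, kd)
    return free / (kd + free)


# Hypothetical abundances in arbitrary units; none of these values come from the paper.
LET7_TOTAL = 10.0
TGFBR3_SITES = 5.0

occupancy_low_hmga2 = target_occupancy(LET7_TOTAL, competitor_sites=1.0,
                                       target_sites=TGFBR3_SITES)
occupancy_high_hmga2 = target_occupancy(LET7_TOTAL, competitor_sites=50.0,
                                        target_sites=TGFBR3_SITES)
```

In this toy setting the Tgfbr3 occupancy falls as the competitor abundance rises, which is the qualitative behaviour the model in Fig. 4e describes; the magnitude depends entirely on the assumed parameters.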

Here we have outlined a novel gene-expression pathway in which a 
protein-coding gene, Hmga2, operates largely independently of its 
protein-coding function to promote lung cancer progression as a com- 
peting endogenous RNA. Although much of this ceRNA activity is 
driven by overexpression of TGF-β signalling through Tgfbr3, there 
are likely to be additional Hmga2 ceRNA targets to be found in future 
studies. Moreover, HMGA2 is overexpressed in many other cancer 
types’’, so it is possible that HMGA2 functions as a ceRNA in cancer 
sites beyond lung. Taken more broadly, these findings raise the pos- 
sibility that many protein-coding genes differentially expressed in 
cancer might contribute to tumorigenesis through this distinct mode 
of regulatory gene expression. Moreover, these results raise issues with 
the validation of candidates in RNA interference screens. The “gold 
standard” assay for validating an siRNA target is expression of an 
siRNA-resistant form of the coding sequence**; however, such an 
approach overlooks the possibility that depletion of both the full- 
length RNA and protein might contribute to a given phenotype, 
requiring complementation by the full-length transcript. Such dual- 
function ceRNA and protein activities necessitate a deeper exploration 
of the coding genome in biological systems. 


METHODS SUMMARY 


Soft-agar assays. Soft-agar assays were performed essentially as described prev- 
iously*. Assays were carried out in triplicate and quantified by microscope. 
Intravenous transplantation. Intravenous injection was performed on 12-week- 
old B6129SF1/Tac male mice (Taconic), essentially as described previously®. In 
short, 10° cells in 50 µl PBS were injected into 12 animals per group. Three weeks 
post injection, animals were scanned using the SkyScan 1176 micro-CT scanner as 
described previously”’. Three mice per group were then euthanized at random for 
surface tumour and histopathological analysis. Surface tumours were quantified 
by counting all the visible tumours on the lung pleura; quantification was carried 
out blind to the expression construct. The remaining nine mice were aged for 
survival analysis. All procedures were performed under an approved project license 
as per UK Home Office regulations. 

RNA immunoprecipitation. RNA immunoprecipitation was carried out on 482N1 
and 368T1 cells with control antibodies (immunoglobulin G (IgG)) or antibody 
targeting Ago2 as per manufacturers’ instructions (Millipore). Total RNA was used 
for either qRT-PCR (quantitative polymerase chain reaction with reverse tran- 
scription) of mRNAs or miRNA-specific qRT-PCR as above. 

Public gene-expression array analysis. NSCLC gene-expression data sets (from 
both The Cancer Genome Atlas (TCGA) and the Director’s Challenge*®) were 
downloaded and processed using standard methods. Patient expression profiles 
were sorted by HMGA2- or TGFBR3-expression status, and the top and bottom 
quartiles of both groups were selected. Target gene expression was then analysed 
and represented as box-and-whisker plots. Statistical significance was assessed 
using Mann-Whitney tests with correction for multiple hypothesis testing. 
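
The quartile comparison described above can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' pipeline: the cohort is synthetic, the helper names (`quartile_groups`, `mann_whitney_u`, `bonferroni`) are hypothetical, and the P value uses the normal approximation with a tie correction and no continuity correction.

```python
import math
import random


def quartile_groups(samples, gene):
    """Sort samples by expression of `gene`; return (top, bottom) quartiles."""
    ranked = sorted(samples, key=lambda s: s[gene])
    n = len(ranked) // 4
    return ranked[-n:], ranked[:n]


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Synthetic cohort (assumption): 180 samples with correlated log2 expression.
rng = random.Random(0)
cohort = []
for _ in range(180):
    hmga2 = rng.gauss(0.0, 1.0)
    cohort.append({"HMGA2": hmga2, "TGFBR3": 0.8 * hmga2 + rng.gauss(0.0, 0.5)})

hmga2_high, hmga2_low = quartile_groups(cohort, "HMGA2")
u_stat, p_raw = mann_whitney_u([s["TGFBR3"] for s in hmga2_high],
                               [s["TGFBR3"] for s in hmga2_low])
p_adjusted = bonferroni(p_raw, n_tests=2)  # two comparisons assumed here
```

Swapping the roles of the two genes in `quartile_groups` gives the converse comparison (HMGA2 expression in TGFBR3 high versus low samples).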


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Mycobacteria manipulate macrophage recruitment 
through coordinated use of membrane lipids 

The evolutionary survival of Mycobacterium tuberculosis, the cause 
of human tuberculosis, depends on its ability to invade the host, 
replicate, and transmit infection. At its initial peripheral infection 
site in the distal lung airways, M. tuberculosis infects macrophages, 
which transport it to deeper tissues’. How mycobacteria survive in 
these broadly microbicidal cells is an important question. Here we 
show in mice and zebrafish that M. tuberculosis, and its close path- 
ogenic relative Mycobacterium marinum, preferentially recruit and 
infect permissive macrophages while evading microbicidal ones. This 
immune evasion is accomplished by using cell-surface-associated 
phthiocerol dimycocerosate (PDIM) lipids’ to mask underlying 
pathogen-associated molecular patterns (PAMPs). In the absence 
of PDIM, these PAMPs signal a Toll-like receptor (TLR)-dependent 
recruitment of macrophages that produce microbicidal reactive nitro- 
gen species. Concordantly, the related phenolic glycolipids (PGLs)* 
promote the recruitment of permissive macrophages through a host 
chemokine receptor 2 (CCR2)-mediated pathway. Thus, we have iden- 
tified coordinated roles for PDIM, known to be essential for myco- 
bacterial virulence’, and PGL, which (along with CCR2) is known to 
be associated with human tuberculosis**. Our findings also suggest 
an explanation for the longstanding observation that M. tuberculosis 
initiates infection in the relatively sterile environment of the lower 
respiratory tract, rather than in the upper respiratory tract, where 
resident microflora and inhaled environmental microbes may con- 
tinually recruit microbicidal macrophages through TLR-dependent 
signalling. 

Pattern recognition receptors (PRRs) such as the TLRs enable host 
recognition of diverse microbes through their PAMPs°. Macrophages 
recruited through TLR signalling pathways can eradicate organisms inva- 
ding the oropharyngeal mucosa, for example, Streptococcus pneumoniae’. 
In contrast, pathogenic mycobacteria appear to use macrophages and 
myeloid dendritic cells for transport across epithelial barriers to their 
infection niche’*. Mycobacteria are replete with TLR PAMPs—such 
as lipoproteins and bacterial cell wall peptidoglycan—that have been 
shown to activate cytokine responses in cultured macrophages’. Yet 
in vivo studies find TLR signalling to be dispensable in the early stages 
of infection’, suggesting that mycobacteria have evolved mechanisms 
to circumvent the bactericidal consequences of TLR signalling. 

To explore these mechanisms, we used zebrafish larvae infected with 
M. marinum, a close genetic relative of M. tuberculosis and the causative 
agent of tuberculosis in ectotherms. This model has yielded important 
insights into the pathogenesis and genetics of human tuberculosis’®. In 
humans, the earliest interactions between mycobacteria and phagocytes 
occur at the lung epithelial surface. Such interactions can be modelled 
in the larva by injection of bacteria or other chemical stimuli into the 
hindbrain ventricle (HBV), a neuroepithelium-lined cavity to which 
phagocytes are recruited* (Fig. 1a). We used morpholino knockdown 
to create zebrafish deficient in MyD88, a common downstream adaptor 
molecule for TLR signalling pathways’. As expected, MyD88 morphants 
had decreased macrophage recruitment to Staphylococcus aureus and 
Pseudomonas aeruginosa, mucosal bacteria that can be commensal 
or pathogenic'’"* (Fig. 1b). Similarly, macrophage recruitment to the 
nonpathogenic Mycobacterium smegmatis was MyD88 dependent. In 
contrast, macrophage recruitment to M. marinum was MyD88 inde- 
pendent (Fig. 1c). This finding suggested that pathogenic mycobac- 
teria have the ability to mask PAMPs that would otherwise induce TLR 
signalling during the initial infection phase. We proposed that such a 
factor would be a cell-surface-associated virulence determinant. In this 
light, PDIM seemed a likely candidate, particularly because it is pre- 
sent only in pathogenic mycobacteria, including M. tuberculosis and 
M. marinum, but absent in non-pathogenic M. smegmatis’. We created 
a M. marinum mutant that lacks PDIM on its surface by knocking out 
the PDIM transporter, encoded by the mmpL7 gene, and confirmed 
that it was attenuated in zebrafish larvae (Fig. 1d and Extended Data 
Fig. 2). If PDIM is masking PAMPs, then macrophage recruitment to 
ΔmmpL7 bacteria should be MyD88 dependent, and this was the case 
(Fig. 1e). In contrast, macrophage migration remained MyD88 
independent in response to M. marinum deficient in another 
cell-surface-associated virulence determinant, Erp (Δerp) (Fig. 1d, e and Extended 
Data Fig. 2)'*. This result was consistent with M. smegmatis possessing 
a functional erp’*, and suggested further that the evasion of MyD88- 
dependent immune detection was mediated specifically by PDIM. 
Our model posits that pathogenic mycobacteria use PDIM to evade 
recruitment of MyD88-dependent macrophage populations detrimental 
to their survival. Therefore, we predicted that wild-type mycobacteria 
would be unaffected in MyD88 morphants, whereas the attenuation of 
ΔmmpL7 should be reversed. We found both to be the case (Fig. 1f). 
For these assays, approximately 80 M. marinum were injected into the 
HBV. However, MyD88 morphants were previously reported to be 
susceptible to higher M. marinum inocula delivered intravenously”. 
We confirmed these findings, showing that MyD88 deficiency increased 
susceptibility at later time points after intravenous administration of >300 
colony forming units (c.f.u.) (Extended Data Fig. 3). It is likely that MyD88 
exerts its protective responses at these later stages through mechanisms 
distinct from the ones we have uncovered, such as through interleukin 
(IL)-1-mediated responses’. Indeed, IL-1 expression was undetectable 
3h after infection, when we observed MyD88-dependent macrophage 
recruitment (data not shown), suggesting an IL-1-independent role for 
MyD88 in mediating recruitment towards PDIM-deficient mycobacteria. 
Further characterization of wild-type versus PDIM-deficient bacteria 
revealed that both strains recruited cells expressing the 
macrophage-specific marker Mpeg1 (ref. 8) (Extended Data Fig. 4a and 
Supplementary Videos 1, 2). We next asked whether these macrophages possessed 
differential microbicidal potential. We examined the expression of indu- 
cible nitric oxide synthase (iNOS) in these recruited cells because: (1) it 
is induced in macrophages upon TLR signalling®, and can be expressed 
by zebrafish’®, mouse’’ and human'* macrophages after mycobacterial 
infection; and (2) mycobacteria are known to be susceptible to reactive 
nitrogen species (RNS) in both murine’? and human’* macrophages. 
We found very few iNOS-positive macrophages arriving in response 
to wild-type M. marinum, whereas the majority of those arriving in 
response to ΔmmpL7 bacteria were iNOS positive (Fig. 2a-c and 
Extended Data Fig. 4b). Δerp bacteria elicited very few iNOS-expressing 
macrophages (Fig. 2c and Extended Data Fig. 4b), further showing that 
this early manipulation of macrophage recruitment and/or activation 
is a specific characteristic of PDIM. We confirmed that RNS were the 
major mediators of MyD88-dependent macrophage microbicidal activity 
by showing that the nitric oxide scavenger 
carboxy-2-phenyl-4,4,5,5-tetramethylimidazolinone-3-oxide-1-oxyl (CPTIO) and the 
nitric oxide synthase inhibitor Nω-nitro-L-arginine methyl ester 
hydrochloride (L-NAME) reversed growth 
attenuation of the ΔmmpL7 mutant (Fig. 2d and Extended Data Fig. 4c). 

Figure 1 | PDIM-mediated evasion of MyD88-dependent macrophage 
recruitment. a, Schematic of a 2 days post-fertilization (dpf) zebrafish showing 
the HBV injection site outlined with dashed white line. Scale bar, 500 µm. 
b, c, Mean macrophage recruitment at 3 h post-infection (hpi) into the HBV of 
wild-type or MyD88-morphant (MO) fish after infection with 150 S. aureus, 
200 P. aeruginosa (b), 80 M. marinum or 85 M. smegmatis (c). Representative 
of three separate experiments. d, Mean bacterial burdens at 3 dpi after HBV 
infection of wild-type (WT) fish with 80 wild-type, ΔmmpL7 or Δerp 
M. marinum. Representative of three separate experiments. e, Mean 
macrophage recruitment at 3 hpi into the HBV of wild-type or 
MyD88-morphant fish after infection with ΔmmpL7 or Δerp M. marinum. 
Representative of four separate experiments. f, Mean bacterial burdens of 
wild-type or MyD88-morphant fish at 3 dpi after HBV infection with wild-type 
or ΔmmpL7 M. marinum. Representative of three separate experiments. 
Significance testing for all panels done using one-way analysis of variance 
(ANOVA), with Bonferroni’s post-test for comparisons shown. *P < 0.05, 
**P < 0.001. NS, not significant. 

Together, our findings suggested that PDIM mediates an immune 
evasion strategy, whereby mycobacteria evade detection by TLRs so as 
to avoid recruitment of iNOS-expressing, microbicidal macrophages. 
To test this idea, we co-infected red fluorescent wild-type bacteria with 
green fluorescent wild-type or ΔmmpL7 bacteria. We found that 
wild-type bacteria were attenuated in the presence of ΔmmpL7 bacteria, and 
that this attenuation transfer was specifically caused by co-infection with 
ΔmmpL7 and not with wild-type or Δerp bacteria (Fig. 2e and Extended 
Data Fig. 5a, b). Furthermore, this transfer of attenuation from ΔmmpL7 
to wild-type bacteria was dependent on macrophages; no attenuation 
was observed when macrophages were depleted before infection using 
a morpholino against the myeloid transcription factor PU.1 (Fig. 2f)"*. 
Attenuation transfer was similarly dependent on MyD88 signalling, as 
well as on RNS production (Fig. 2g, h and Extended Data Fig. 5c). 

As PDIM is not the only substrate for the MmpL7 transporter, we 
confirmed that the effects we observed were due to the lack of PDIM 
per se by using a PDIM synthesis mutant, Δmas, showing it to both 
recruit macrophages in a MyD88-dependent fashion and to transfer 
attenuation to wild-type bacteria (Extended Data Fig. 6). Finally, to rule 
out the possibility that the PDIM-deficient mutants simply had increased 
expression of the culpable PAMP(s), we co-injected heat-killed, crushed 
wild-type bacteria together with live wild-type bacteria. If the culpable 
PAMP(s) are expressed by wild-type bacteria, then these PAMPs should 
become exposed by crushing the bacteria and cause attenuation of the live 
bacteria. We found this to be the case (Fig. 2i). Altogether, these results 
suggest that PDIM physically masks underlying mycobacterial PAMPs, 
thereby preventing mycobacterial delivery into microbicidal macrophages. 

To corroborate our findings in a second model, we infected mice 
through the aerosol route with wild-type M. tuberculosis (H37Rv) or 
with an isogenic strain (ΔdrrA) defective for proper PDIM surface 
localization and virulence in mice’. At 21 days post-infection (dpi), we 
found substantially greater proportions of iNOS-producing cells among 
the CD11b⁺Ly6Cʰⁱ inflammatory monocyte population in the lungs of 
mice infected with the ΔdrrA mutant compared to mice infected with 
the wild-type strain (Fig. 3 and Extended Data Fig. 7). Thus, PDIM- 
mediated evasion of TLR-dependent immune recognition is shared by 
M. tuberculosis in the context of the mammalian lung, consistent with 
its central role in avoidance of TLR-dependent antimicrobial mecha- 
nisms such as iNOS and antimicrobial peptides’. 

We next sought to understand the mechanism by which mycobac- 
teria recruit the permissive macrophages that are essential for their trans- 
port into host tissues. Given our previous finding that M. marinum 
recruits only macrophages (and not neutrophils) to the HBV®, we con- 
sidered macrophage-specific chemokines as candidates for mediating 
this recruitment. We investigated CCR2, which has been implicated in 
macrophage migration to bacterial pathogens in mice”, including macro- 
phages that are permissive to M. tuberculosis replication after aerosol 
infection”. We identified the functional zebrafish CCR2 orthologue 
(see Methods) and confirmed that its knockdown resulted in reduced 
macrophage migration in response to recombinant human chemokine 
ligand 2 (CCL2) and not to the closely related human macrophage 
chemokines CCL4 and CCL5 (Extended Data Fig. 8a). The specificity 
of CCL2-mediated macrophage migration was revealed by the follow- 
ing findings: (1) human and mouse CCL2 induced macrophage but not 
neutrophil migration (Extended Data Fig. 8b, c); (2) recombinant human 
IL-8, a neutrophil chemokine, induced neutrophil but not macrophage 
migration (Extended Data Fig. 8b, c); (3) human leukotriene B4 (LTB4) 
induced recruitment of both neutrophils and macrophages (Extended 
Data Fig. 8b, c), as expected’*®; and (4) MyD88 knockdown did not dimi- 
nish CCL2-mediated macrophage migration, ruling out TLR-mediated 
migration in response to any endotoxin that might be contaminating 
the chemokine preparations (Extended Data Fig. 8b). 

CCR2 morphants had reduced macrophage migration in response to 
wild-type M. marinum, confirming the role of this pathway in recruitment 
(Fig. 4a). Recruitment to PDIM-deficient M. marinum was unaffec- 
ted, showing that TLR PAMPs trigger recruitment through a CCR2- 
independent pathway (Fig. 4a). Accordingly, we found that M. marinum 
infection induced CCL2, and that CCL2 morphants also had reduced 
macrophage recruitment in response to infection (Fig. 4b and Extended 
Data Fig. 9). 

Turning to the question of which bacterial determinant induced the 
CCR2 pathway, we considered PGL, a molecule closely related to PDIM 
in both M. marinum and M. tuberculosis’. Although many clinical 
M. tuberculosis isolates have lost PGL, its presence has been linked 
to increased virulence®. Moreover, among M. tuberculosis clinical iso- 
lates, PGL expression was linked to Ccl2 expression in a mouse lung 
infection model”'. Similarly, we found that PGL was required for ccl2 
induction in the zebrafish larva; deletion of the M. marinum pks15 locus 
specifically abrogates PGL, but not PDIM, production (data not shown) 
and resulted in loss of ccl2 induction. ΔmmpL7 bacteria, which lack 
surface expression of both PGL and PDIM, similarly failed to induce 
ccl2, highlighting that this chemokine is not induced through TLR 
interactions, but rather is specifically induced through PGL-mediated 
interactions (Fig. 4b). Furthermore, Δpks15 bacteria recruited fewer 
macrophages upon infection of wild-type larvae, and this reduction was 
similar to that seen in CCR2 morphants infected with wild-type bacteria 
(Fig. 4c). There was no additional reduction in recruitment when CCR2 
morphants were infected with PGL-deficient bacteria, suggesting that 
PGL recruits macrophages solely through the CCR2 pathway (Fig. 4c). 

Figure 2 | Increased iNOS-dependent microbicidal activity of macrophages 
recruited to PDIM-deficient mycobacteria. a, b, Representative images of 
wild-type (WT) (a) and ΔmmpL7 (b) M. marinum-infected fish from c. N = 13 
(wild-type) and 14 (ΔmmpL7) larvae per group. Scale bar, 50 µm. c, Percentage 
of infected macrophages that were iNOS positive in the HBV at 3 dpi with 80 
wild-type, ΔmmpL7 or Δerp M. marinum. Representative of three separate 
experiments. d, Mean bacterial burdens of 2 dpf control (CTRL)- or RNS 
scavenger (CPTIO)-treated fish after HBV infection with 80 wild-type or 
ΔmmpL7 M. marinum. Representative of two separate experiments. NS, 
not significant. e-h, Mean bacterial volume of red fluorescent wild-type 
M. marinum (infection inoculum 30-40) when co-infected with 30-40 green 
fluorescent wild-type, ΔmmpL7 or Δerp M. marinum at 3 dpi in wild-type 
(e), PU.1-morphant (MO) (f), MyD88-morphant (g), or CPTIO-treated 
(h) larvae. e, g, Co-infection of wild-type and ΔmmpL7 M. marinum in 
wild-type or MyD88-morphant fish is representative of at least three separate 
experiments, and co-infection with Δerp is representative of two separate 
experiments. f, h, Representative of two separate experiments. a-h, Significance 
testing done using one-way ANOVA, with Bonferroni’s post-test for 
comparisons shown. *P < 0.05, **P < 0.01, ***P < 0.001. i, Mean bacterial 
volume of red fluorescent wild-type M. marinum at 3 dpi (infecting inoculum 
30-40) when co-infected with the volume equivalent of 30-40 heat-killed, 
crushed wild-type M. marinum. Representative of two separate experiments. 
Student’s unpaired t-test. 

Figure 3 | Elevated frequencies of iNOS-expressing inflammatory 
monocytes in mice infected with PDIM-deficient M. tuberculosis. 
a, b, C57BL/6 mice were infected through the aerosol route with H37Rv or 
an isogenic PDIM-deficient mutant (ΔdrrA). Lung tissue was harvested at 
21 dpi and iNOS protein expression was measured using flow cytometry. 
Representative fluorescence-activated cell sorting (FACS) plots (a) and 
graphical depiction (b) of frequencies of iNOS-expressing cells within the 
CD11b⁺Ly6Cʰⁱ inflammatory monocyte population. Representative of two 
separate experiments. Student’s unpaired t-test. 

Our findings implicate PGL in bacterial virulence and, correspond- 
ingly, the CCR2 pathway in host susceptibility. Globally, a large pro- 
portion of M. tuberculosis isolates are PGL deficient due to a frameshift 
in pks15 (ref. 2). However, the importance of PGL in mediating viru- 
lence and/or transmission is underscored by its presence in many of 
the W-Beijing strains, which are becoming rapidly enriched among 
M. tuberculosis isolates globally’, and have predominated in outbreaks 
in North America, where tuberculosis is not prevalent”. Infectivity is 
a key requirement for transmission, and our data suggested that PGL 
may enhance infectivity through CCR2-mediated recruitment of per- 
missive macrophages at the earliest stages of infection. This enhance- 
ment may be particularly relevant in the context of human infections, 
in which the infectious dose is thought to be as low as 1-3 bacteria”. 
To test the hypothesis that PGL enhances infectivity at low doses, we 
compared the ability of wild-type and PGL-deficient strains to estab- 
lish infection. Confocal microscopy was used 5 hours after HBV injec- 
tion to select those animals that had received 1-3 bacteria (Extended 
Data Fig. 10), and then again at 5 dpi to identify which animals were still 
infected. We found that 89% of the wild-type but only 18% of the Δpks15 
infections were successful (Fig. 4d). Concurrent administration of 
recombinant CCL2 restored the infectivity of Δpks15 bacteria, provided the 
CCR2 pathway was intact (Fig. 4d). Correspondingly, we found that 
wild-type bacteria had a lower infectivity rate in CCR2 morphants 
(Fig. 4d). Consistent with our finding that PGL recruits macrophages 
solely through CCR2, there was no further decrease in infectivity in 
CCR2 morphants infected with the PGL mutant (Fig. 4d). Finally, the 
infectivity of wild-type bacteria in MyD88 morphants was undimi- 
nished (90% for wild type versus 83% for morphants), consistent with 
our finding that TLR signalling is not involved in macrophage recruit- 
ment to wild-type bacteria. 

Figure 4 | Macrophage recruitment and subsequent infectivity is mediated 
by mycobacterial PGL and host CCR2. a, Mean macrophage recruitment at 
3 hpi into the HBV of wild-type or CCR2-morphant (MO) fish after infection 
with 80 wild-type (WT) or ΔmmpL7 M. marinum. Representative of three 
independent experiments. One-way ANOVA, with Bonferroni’s post-test for 
comparisons shown. *P < 0.05. b, ccl2 messenger RNA levels (mean ± 
standard error of the mean (s.e.m.) of four biological replicates) induced at 3 h 
after caudal vein infection of 2 dpf larvae with 250-300 wild-type, Δpks15 or 
ΔmmpL7 M. marinum. One-way ANOVA with Tukey’s post-test. *P < 0.05. 
c, Mean macrophage recruitment at 3 hpi into the HBV after infection with 
80 wild-type or Δpks15 M. marinum. Representative of three separate 
experiments. One-way ANOVA with Bonferroni’s post-test for comparisons 
shown. **P < 0.01, ***P < 0.001. d, Wild-type and CCR2-morphant fish, with 
or without the addition of 5 µg ml⁻¹ CCL2, were infected in the HBV with 1-3 
wild-type or Δpks15 M. marinum. Graph shows the percentage of fish that were 
infected (black) or uninfected (grey) after 5 days. n = number of larvae per 
group. Representative of two separate experiments. Significance was evaluated 
using Fisher’s exact test for each comparison. **P < 0.01, ***P < 0.001. 
NS, not significant. 
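
The infectivity comparisons in panel d rest on Fisher's exact test applied to 2 x 2 tables of infected versus uninfected animals. The sketch below shows one way such a test can be computed in standard-library Python. The per-group counts are hypothetical, chosen only so that the proportions round to the reported 89% and 18%; the actual group sizes appear in the figure, not in the text, and the function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = table_probability(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_value = sum(table_probability(k) for k in range(k_min, k_max + 1)
                  if table_probability(k) <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts (assumption): 16 of 18 wild-type infections established
# (89%) versus 3 of 17 PGL-deficient infections (18%).
wild_type_infected, wild_type_uninfected = 16, 2
mutant_infected, mutant_uninfected = 3, 14

p_infectivity = fisher_exact_two_sided(wild_type_infected, wild_type_uninfected,
                                       mutant_infected, mutant_uninfected)
```

An exact test suits this design because each group contains few animals and the outcome is binary (infected or not at 5 dpi).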


These findings highlight the interdependency between bacterial PGL 
and host CCR2 signalling in driving bacterial infectivity under the low 
inoculum conditions relevant to human infection. Previous investi- 
gations into the role of PGL and CCR2 may have failed to reveal these 
mechanisms because those studies used higher inocula and, in the 
study of CCR2 signalling, a PGL-deficient strain’’. Indeed, our finding 
that CCR2 signalling is a host susceptibility factor is reinforced by human 
studies showing an association between the high expression of CCL2 
and tuberculosis susceptibility*. Furthermore, the association appears 
to be stronger in east Asian populations”, where clinical isolates are 
enriched for the predominantly PGL-expressing W-Beijing strains”’. In 
light of our findings, we propose that the enrichment of PGL expression 
among these strains is influencing this association, as the CCR2 path- 
way would be most relevant in the context of bacterial PGL stimulation. 

Finally, our data suggested an explanation for why M. tuberculosis 
must reach the alveolar surfaces of the distal lung in order to initiate 
infection” (Extended Data Fig. 1). It is well established that tubercu- 
losis results from inhalation of small aerosol droplets containing ~ 1-3 
bacteria, which are capable of reaching the alveolar surfaces of the distal 
lung; in contrast, large droplets harbouring ~10* bacteria are trapped 
in the upper bronchial passages and are far less successful at establish- 
ing infection”. These observations have led to the idea that the alve- 
olar surfaces of the distal lung offer a more favourable environment for 
mycobacterial proliferation. We wondered if commensal microbes from 
the oropharyngeal surfaces, as well as inhaled environmental organisms, 
might lead to continual TLR signalling in the upper respiratory tract 
that would then override the mycobacterial PDIM-dependent immune 
evasion strategies we identified. In contrast, the lower respiratory tract, 
which is relatively sterile**, would favour recruitment of 
Mycobacterium-permissive macrophages. To test this hypothesis, we co-infected animals 
with M. marinum together with bacterial colonizers of the pharynx that 
induce TLR signalling: either S. aureus, a common Gram-positive 
colonizer of the nasopharynx in both adults and children'’, or the 
Gram-negative bacterium P. aeruginosa, also reported to colonize 
the pharynx of asymptomatic adults and children’. Co-infection with 
P. aeruginosa resulted in the attenuation of wild-type mycobacteria by 
1 dpi, and continued into 3 dpi (Fig. 5a). Mycobacterial growth was 
attenuated despite rapid clearing of P. aeruginosa: 56% of the animals 
had cleared the co-infected P. aeruginosa by 1 dpi, and 76% by 3 dpi, 
with only a few residual bacteria in the remaining animals. Thus, it was 
not the physical presence of, but rather the detrimental immunological 
milieu induced by P. aeruginosa that was responsible for the 
attenuation of M. marinum. Consistent with our hypothesis, we found that 
the detrimental effect of P. aeruginosa on mycobacterial survival was 
MyD88 dependent (Fig. 5b). S. aureus co-infection also had a 
MyD88-dependent detrimental effect on M. marinum survival (Fig. 5c). 

Figure 5 | MyD88-dependent macrophage recruitment elicited by other 
bacterial pathogens and commensals attenuates pathogenic mycobacteria. 
a, Mean bacterial volume of red fluorescent M. marinum (infecting inoculum 
30-40) after co-infection with either 30-40 green fluorescent M. marinum 
(Mm) or 300 P. aeruginosa (Pa) at 1 and 3 dpi. Representative of three separate 
experiments. Significance assessed using Student’s t-test. b, c, Mean bacterial 
volume of 30-40 red fluorescent wild-type M. marinum (infecting inoculum 
30-40) after co-infection with either 30-40 green fluorescent wild-type 
M. marinum or 300 P. aeruginosa (b) or 300 S. aureus (Sa) (c) at 3 dpi, 
in wild-type or MyD88-morphant (MO) larvae. Significance tested by one-way 
ANOVA with Bonferroni’s post-test for comparisons shown. *P < 0.05. 

Our previous work identified strategies by which intracellular myco- 
bacteria manipulate host pathways after having traversed epithelial bar- 
riers; these involve a bacterial protein secretion system that expands 
the bacterial niche through macrophage recruitment to the nascent 
granuloma”. We now describe what may be the first contact between 
mycobacteria and their hosts, and the manner in which mycobacteria 
manipulate recruitment, and potentially influence the differentiation 
or activation state of the first responding macrophages so as to gain 
access to their preferred niche (Extended Data Fig. 1). The choreographed 
entry involves two related mycobacterial lipids acting in concert to avoid 
one host pathway while inducing another. Our findings link PDIM, 
recognized as an absolutely essential mycobacterial virulence factor, to 
the evasion of TLR detection and thus explain the dispensability of 
TLR-mediated immunity in protection against M. tuberculosis infec- 
tion in both human and animal studies®”’. In contrast, PGL is dispens- 
able for virulence, being variably present among clinical isolates. Yet its 
presence in the ancestral M. canettii strains as well as in M. marinum,
the closest genetic relative of the M. tuberculosis complex’, suggests its
integral role in the evolution of mycobacterial pathogenicity. Tuber- 
culosis is an ancient disease, and the enhanced infectivity conferred by 
PGL may have been essential for most of its history before human crow- 
ding, with its greatly increased opportunities for transmission, made 
it dispensable”. 

Our findings suggest a central role for commensal flora in choreo- 
graphing mycobacterial entry. Not only must pathogenic mycobacteria 
possess a physical barrier to prevent host TLR-mediated detection, but 
they must also evade TLR signalling initiated by other organisms, by 
entering through the distal lung (Extended Data Fig. 1). Our work may 
also explain the paradox that smaller M. tuberculosis droplets are more 
infectious than larger ones. However, the requirement placed on myco- 
bacteria to gain entry through the distal lung makes tuberculosis less 
contagious than most other respiratory infections, thus assigning a 
protective role to the commensal flora. Conversely, the persistence of 
human tuberculosis for over 70,000 years” attests to the effectiveness 
of the mycobacterial evolutionary survival kit (masking lipid, recruit- 
ing lipid and small infection droplets) to simultaneously evade and 
manipulate the host and its commensal flora. 


METHODS SUMMARY 


M. marinum, S. aureus and P. aeruginosa constitutively expressed fluorescent proteins 
GFP, Wasabi or tdTomato to allow visualization. Zebrafish larvae (of undeter- 
mined sex given the early developmental stages used) were infected at 36-48h 
post-fertilization (hpf) via caudal vein or HBV injection. Larvae were randomly 
allotted to the different experimental conditions. Fluorescence images were cap- 
tured and quantitative fluorescence was used as a surrogate for bacterial burdens. 
For the macrophage recruitment assays, macrophages and neutrophils in the HBV 
were enumerated using differential interference contrast microscopy 3 h after HBV 
infection. For determination of infection burdens in the HBV, 1 and 3 dpi larvae 
were mounted in 1.5% agarose (low melting point) and confocal z-stacks of 2 µm
were obtained. For the infectivity assay, 2 dpf larvae were injected in the HBV with 
a concentration of mycobacteria that resulted in an average of 0.8 bacteria per 
injection. Larvae harbouring 1-3 bacteria were identified at 5 hpi using confocal 
microscopy, and were re-imaged at 5 dpi and scored as infected or uninfected. 
Antibody staining for iNOS was performed as described* and imaged by confocal microscopy.
CPTIO (Sigma) was used at a final concentration of 500 µM in 0.1% dimethylsulphoxide
in fish water. Larvae were incubated immediately after infection and
fresh CPTIO was added every 24h for the duration of the experiment. For quanti- 
tative real-time PCR, complementary DNA was synthesized from pools of 20-40 
larvae as previously described*. ccl2 RNA levels were determined using SYBR 
green and the primers 5’-GTCTGGTGCTCTTCGCTTTC-3’ and 5’-TGCAGAG 
AAGATGCGTCGTA-3’. Ten-week-old female C57BL/6 mice were infected through 
the aerosol route with M. tuberculosis strains. For iNOS staining, lungs were harvested
and processed at 21 dpi. Statistical analyses were performed using GraphPad
Prism software. Zebrafish and mouse husbandry and all experiments performed 
on them were in compliance with Institutional Animal Care and Use Committee 
approved protocols. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS

Bacterial strains and methods. M. marinum strain M (ATCC BAA-535) and the
Δerp mutant have been described*’. The ΔmmpL7 and Δpks15 mutants were generated
as described in the following section. Fluorescently labelled bacterial strains
were generated by transformation with the pTEC15 or pTEC27 plasmids (depo- 
sited with Addgene, plasmids 30174 and 30182, respectively), resulting in msp12- 
driven expression of the Wasabi or tdTomato fluorescent proteins, respectively. 
Mycobacteria were grown at 33 °C in Middlebrook 7H9 broth or on 7H10 agar 
(both by Difco) supplemented with 0.5% bovine serum albumin, 0.005% oleic acid, 
0.2% glucose, 0.2% glycerol, 0.085% sodium chloride and 0.05% Tween-80 (broth 
culture only). 50 µg ml⁻¹ hygromycin was added as appropriate. For sucrose counter-selection,
7H10 agar was supplemented with 10% sucrose. Single-cell suspensions of
bacteria were prepared as described*’. To prepare heat-killed crushed M. marinum, 
bacteria were incubated at 80 °C for 20 min and then homogenized in a Biospec Bead 
Beater together with 0.1 mm silica spheres for 1 min. The P. aeruginosa PAO1 fluorescent
strain used in this study has been described**. The S. aureus Newman strain
expressing pOS1-SdrC-mCherry #391 was a gift from J. Bubeck Wardenburg. 
Targeted deletion of mmpL7 and pks15. A 2,638 bp PstI fragment containing 
part of the mmpL7 (MMAR_1764) open reading frame (ORF) was cloned into 
a pBluescript-derived vector, pBSXKpn.2 (C.L.C., unpublished observations). A 
1,124 bp KpnI fragment internal to mmpL7 was then excised and replaced with the 
aph cassette, conferring kanamycin resistance. The sucrose counter-selectable 
marker, sacB, and an additional marker, hygA, conferring hygromycin resistance, 
were then added to create pJENK7.1::Hyg. This construct was transformed into the 
wild-type reference strain, M, and kanamycin-resistant colonies were selected. Sub- 
sequent screening for sucrose sensitivity identified merodiploids that were verified 
by Southern blotting. One such merodiploid was then grown in liquid culture for 
10 days and plated on sucrose-containing medium. Sucrose- and kanamycin- 
resistant hygromycin-sensitive colonies were then verified by Southern blotting 
to identify the ΔmmpL7 mutant, KT15. This strain was verified to be deficient
in surface localization of PDIM (data not shown), and exhibited colony morpho- 
logy defects previously reported for M. marinum PDIM mutants™. The pks15 
(MMAR_1762) locus was deleted as follows. Flanking regions upstream and downstream
of pks15 were amplified by PCR using primers 5′pks15F (5′-CCGCTCGAGGGTGCGATGCGTGGTATC-3′)
and 5′pks15R.2 (5′-CGACTAGTTCAGTTGCTCCTGTTCATG-3′), and 3′pks15F
(5′-GGAGCAACTGAACTAGTACCATCCGACACCGACTG-3′) and 3′pks15R.2
(5′-CCGTCTAGAGTGGTGCTGTTCGGCGTC-3′), respectively. These fragments were sequentially inserted, directly
adjacent to each other, into pBluescriptSK + ::SacBHyg.1 (C.L.C., unpublished obser- 
vations), a pBluescript derivative which contains sacB and hygA external to the 
multiple cloning site. The resulting construct, pPKS15KO, bears an unmarked 
deletion of the pks15 ORF, and was used to transform strain M, and hygromycin- 
resistant colonies were selected. Putative merodiploids were verified by Southern 
blotting and then counter-selected on sucrose as described earlier, to produce the 
sucrose-resistant hygromycin-sensitive isolate (KT21), which was then verified by 
Southern blotting. Additional verification by thin-layer chromatography deter- 
mined that PGL was absent, whereas PDIM production was retained (data not 
shown), consistent with deletion of pks15 in M. tuberculosis’. 

Zebrafish husbandry and infections. Wild-type AB zebrafish were maintained as 
described*’. Larvae (of undetermined sex given the early developmental stages used) 
were infected at 36-48 h post-fertilization (hpf) via caudal vein or HBV injection 
using thawed single-cell suspensions of known titre****. The number of animals to 
be used for each experiment was guided by past results with other bacterial mutants 
and/or zebrafish morphants. Larvae were randomly allotted to the different experi- 
mental conditions. Zebrafish husbandry and all experiments performed on them 
were in compliance with Institutional Animal Care and Use Committee approved 
protocols. 

Microscopy and image-based quantification of infection level. Wide-field 
microscopy was performed using a Nikon Eclipse Ti-E equipped with a C-HGFIE 
130W mercury light source, Chroma FITC (41001) filter, and ×2/0.10 Plan Apochromat
objective. Fluorescence images were captured with a CoolSNAP HQ2
Monochrome Camera (Photometrics) using NIS-Elements (version 3.22). Quan- 
tification of fluorescent M. marinum infection using images of individual embryos 
using Fluorescent Pixel Count (FPC) was performed as previously described**. For 
confocal imaging, larvae were embedded in 1.5% agarose (low melting point)**. A
series of z-stack images with a 2 µm step size was generated through the infected
HBV, using the galvo scanner (laser scanner) of the Nikon A1 confocal microscope
with a ×20 Plan Apo 0.75 NA objective. Bacterial burdens were determined
by using the three-dimensional surface-rendering feature of Imaris (Bitplane Scien- 
tific Software)*. 
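
As a rough illustration of what an image-based pixel count involves, the sketch below counts the pixels above a background threshold in a toy intensity grid. The function name, the threshold value and the toy image are assumptions made for illustration only; the FPC procedure actually used is the one described in the cited reference, not this code.

```python
def fluorescent_pixel_count(image, threshold):
    """Count pixels whose intensity exceeds a background threshold.

    `image` is a list of rows of intensity values (hypothetical format).
    """
    return sum(1 for row in image for value in row if value > threshold)


# Toy 4 x 4 "image": a bright 2 x 2 patch on a dim background.
toy_image = [
    [3, 5, 4, 2],
    [4, 120, 200, 3],
    [2, 180, 90, 5],
    [3, 4, 6, 2],
]

BACKGROUND_THRESHOLD = 50  # hypothetical cut-off, not taken from the paper

fpc = fluorescent_pixel_count(toy_image, BACKGROUND_THRESHOLD)
print(f"Fluorescent pixel count: {fpc}")
```

In practice the threshold would be set from uninfected control larvae imaged under the same acquisition settings, so that background autofluorescence is not counted.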

Hindbrain assays. Macrophage recruitment assays were performed as previously 
described**. For determination of HBV infection burdens, 1 and 3 dpi larvae were 
mounted in 1.5% agarose and confocal z-stacks of 2 µm were obtained.
iNOS staining. Antibody staining of larvae was performed as described*’. Larvae
were then imaged using confocal microscopy and the number of infected macro- 
phages that were positive for iNOS staining was determined for each larva. 
iNOS scavengers. Fish were treated as previously described**. CPTIO or L-NAME 
(Sigma) were used at a final concentration of 500 µM and 1 mM, respectively, in
0.1% dimethylsulphoxide in fish water. Fish were incubated immediately follow- 
ing infection and fresh inhibitor was added every 24h until bacterial burden was 
determined. 

Morpholinos. The morpholinos described in Supplementary Table 1 were injected 
at the 1-4-cell stage as previously described’. 

Reverse transcription PCR to verify efficacy of MyD88 morpholino. RNA was 
extracted from pools of 15-40 embryos using TRIzol reagent (Life Technologies), 
treated with Turbo DNA-Free Kit (Life Technologies) and cDNA synthesized with 
PrimeScript (Takara). Primers used for PCR were as follows: actin, forward, 5'- 
ACCTGACAGACTACCTGATG, reverse, 5’-TGAAGGTGGTCTCATGGATAG; 
myd88, forward, 5'-ATGGCATCAAAGTTAAGTATAGACC, reverse, 5’-AGG 
GCAGTGAGAGTGCTTTG. 
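
A quick sanity check on primer pairs such as these is to compare their lengths, GC content and approximate melting temperatures. The sketch below does this for the four sequences listed above. The Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a common rule of thumb for short oligonucleotides and is an assumption of this example, not a calculation reported in the paper; the dictionary keys are illustrative names.

```python
def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature: 2 * (A + T) + 4 * (G + C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "actin_forward": "ACCTGACAGACTACCTGATG",
    "actin_reverse": "TGAAGGTGGTCTCATGGATAG",
    "myd88_forward": "ATGGCATCAAAGTTAAGTATAGACC",
    "myd88_reverse": "AGGGCAGTGAGAGTGCTTTG",
}

summary = {
    name: (len(seq), gc_fraction(seq), wallace_tm(seq))
    for name, seq in PRIMERS.items()
}

for name, (length, gc, tm) in summary.items():
    print(f"{name}: {length} nt, GC {gc:.0%}, Wallace Tm ~{tm} C")
```

Large differences in estimated melting temperature within a pair would suggest checking the annealing conditions; nearest-neighbour methods give better estimates than this rule of thumb.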

Identification of candidate CCR2 orthologue in zebrafish. Basic Local Align- 
ment Search Tool (BLAST) searches of the zebrafish genome (http://www.ensembl. 
org) identified two closely related CCR-like genes on chromosome 16: ENSDARG 
00000079829 and ENSDARG00000062999. In BLAST comparisons to the human 
genome, ENSDARG00000079829 was found to have the highest homology to 
human CCR2 (E value 8.8 X 10° '’”), whereas ENSDARG00000062999 was most 
highly homologous to human CCR4 (E value 2 X 10°). In addition, annotation 
of the zebrafish genome from NCBI annotates ENSDARG00000079829 as a CCR2- 
like gene. We confirmed expression of the mRNA and identified the short 5’ upstream 
exon ATGTCGGCGACACAAAACAGTA using 5’ rapid amplification of cDNA 
ends (RACE). 

Identification of zebrafish CCL2 orthologue. Protein sequences of human and 
mouse CCL2 were used to interrogate the zebrafish genome by BLAST. Expression 
levels of the four most closely related zebrafish proteins were then examined at 
3 hpi to identify the likely functional orthologue (Extended Data Fig. 9a). Of the 
four candidates, only ENSDARG00000041835 was significantly induced at 3 hpi. 
Knockdown of ENSDARG00000041835 resulted in a decrease in macrophage 
recruitment into the HBV at 3 hpi (Extended Data Fig. 9b). 

Quantitative real-time PCR. cDNA was synthesized from pools of 20-40 larvae 
as previously described*’. Quantification of ccl2 RNA levels was determined using 
SYBR green and the following primer pair: 5′-GTCTGGTGCTCTTCGCTTTC-3′
and 5'-TGCAGAGAAGATGCGTCGTA-3’. 

Infectivity assay. 2 dpf larvae were infected via the HBV* with an average of 0.8 
bacteria per injection. Fish harbouring 1-3 bacteria were then identified at 5 hpi by 
confocal microscopy. These infected fish were then evaluated at 5 dpi and were scored 
as infected or uninfected, based on the presence or absence of fluorescent bacteria. 
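
To see why a mean of 0.8 bacteria per injection is a convenient target, one can estimate how many larvae end up with 1-3 bacteria. The sketch below assumes that the number of bacteria delivered per injection follows a Poisson distribution; this is an assumption made for illustration, as the text states only the average.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MEAN_BACTERIA_PER_INJECTION = 0.8  # average stated in the Methods

# Probability that an injection delivers no bacteria at all.
p_none = poisson_pmf(0, MEAN_BACTERIA_PER_INJECTION)

# Probability that an injection delivers 1, 2 or 3 bacteria.
p_one_to_three = sum(
    poisson_pmf(k, MEAN_BACTERIA_PER_INJECTION) for k in (1, 2, 3)
)

print(f"P(0 bacteria)   = {p_none:.3f}")
print(f"P(1-3 bacteria) = {p_one_to_three:.3f}")
```

Under this assumption roughly 45% of injections deliver no bacteria and roughly 54% deliver 1-3, which is why larvae were screened by confocal microscopy at 5 hpi before being scored.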
Mice, aerosol infections, and flow cytometry. C57BL/6 mice were purchased 
from Jackson Laboratories. All mice were housed under specific pathogen-free 
conditions at the Seattle Biomedical Research Institute, and all experiments were 
performed in compliance with the respective Institutional Animal Care and Use 
Committee approved protocols. Ten-week-old female mice were randomized to 
the different experimental groups. The number of mice to be used to adequately 
power the experiment was guided by the results of the corresponding zebrafish 
experiments. A stock of M. tuberculosis strain H37Rv or the isogenic PDIM-
deficient ΔdrrA strain was sonicated before use and mice were infected in an
aerosol infection chamber (Glas-Col) with approximately 200 c.f.u. of H37Rv or
1,000 c.f.u. of ΔdrrA to achieve similar bacterial burdens at 21 dpi. The infectious
dose in each experiment was determined by plating lung tissue of two mice from 
each group. Colonies on 7H10 agar plates were counted after 21 days of incubation 
at 37 °C. Lung tissue was perfused with 5 ml of PBS administered through the right 
ventricle of the heart, finely chopped using a gentleMACS Octo Dissociator (Miltenyi
Biotec) and incubated at 37 °C for 30 min in HEPES buffer containing Liberase 
Blendzyme 3 (Roche Applied Science). After digestion, single-cell suspensions were 
prepared by passing tissue through a cell strainer. Single-cell suspensions were then 
stained for flow cytometric analysis. Lung single-cell suspensions were surface 
stained at 4 °C for 20 min in the presence of Fc block (24G2) with the following 
antibodies from eBioscience: PE-Cy7-labelled anti-CD4 (GK1.5, eBioscience), 
anti-CD8α (53-6.7, eBioscience), anti-CD11c (N418), and FITC-labelled anti-Ly6G
(1A8) to exclude T cells, dendritic cells and neutrophils. Alveolar macrophages
were excluded based on their high CD11c expression and autofluorescence.
PerCPCy5.5-labelled anti-Ly6C (HK1.4) and APC-efluor-780-labelled anti-CD11b
(M1/70) were used to identify CD11b+Ly6Chi monocytes. Intracellular staining
was done after fixation and permeabilization, following the manufacturer’s recom- 
mendations (eBioscience). Cells were fixed and permeabilized using eBioscience’s 
Fix/Perm buffer for 1h at 4°C, followed by staining for iNOS with anti-NOS2 
Alexa Fluor 405 (C-11, Santa Cruz Biotechnology) or mouse IgG1 isotype control 
for 30 min at 4 °C. Samples were analysed on an LSR-II (BD Biosciences) and
FlowJo Software (Treestar). 

Statistics. Statistical analyses were performed using Prism 5.01 (GraphPad). For 
data sets requiring log10 transformation before ANOVA, embryos with no detectable
fluorescence above background were assigned a value of 0.9, with 1 being the
limit of detection, before log10 transformation. Post-test P values are as follows:
*P < 0.05; **P < 0.01; ***P < 0.001.
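
The substitution rule described above can be sketched as follows. The constant and function names are illustrative, and treating every value below the limit of detection as undetectable is an assumption of this example; the analysis itself was run in Prism.

```python
import math

LIMIT_OF_DETECTION = 1.0      # stated limit of detection
BELOW_DETECTION_VALUE = 0.9   # value assigned to undetectable embryos


def log10_transform(values):
    """Log10-transform burdens, substituting 0.9 for undetectable values.

    Any value below the limit of detection (including 0) is replaced
    before the transformation, because log10(0) is undefined.
    """
    adjusted = [
        v if v >= LIMIT_OF_DETECTION else BELOW_DETECTION_VALUE
        for v in values
    ]
    return [math.log10(v) for v in adjusted]


# Toy burdens: one embryo with no detectable fluorescence.
example = log10_transform([0, 1, 100])
print(example)
```

Assigning 0.9 places undetectable embryos just below the limit of detection on the log scale (a small negative value) instead of excluding them from the ANOVA.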

Extended Data Figure 1 | Coordinate use of PDIM-mediated immune
evasion and PGL-mediated recruitment by pathogenic mycobacteria.
are shown in the context of the relatively sterile lower airway versus the
upper airway, with its higher levels of resident microflora and inhaled
environmental organisms.

Extended Data Figure 2 | ΔmmpL7 bacteria are attenuated in zebrafish
larvae. a, Kaplan-Meier graph showing daily survival of larvae infected via
caudal vein injection with medium (mock), 29 wild-type or 70 ΔmmpL7
M. marinum. N = 25 (mock), 31 (wild-type), or 29 (ΔmmpL7) larvae per group.
Mean time to death (days): mock (11), wild type (7.6) and ΔmmpL7 (11.2).
Survival was compared by log-rank test: wild type versus mock and wild type
versus ΔmmpL7, P < 0.0001; mock versus ΔmmpL7, P = 0.5601. b, c, Larvae
were infected via caudal vein injection 1 dpf with 550 wild-type, 650 ΔmmpL7,
or 700 Δerp, fluorescent M. marinum. b, Infection burdens were measured by
Fluorescent Pixel Count (FPC; mean ± s.e.m.). c, Representative images at
7 dpi. N = 29 (wild-type and ΔmmpL7) or 30 (Δerp) larvae per group.
Scale bar, 500 µm. At 3, 5 and 7 dpi, log10 FPC was compared by ANOVA,
with Dunnett’s post-test. ***P < 0.001. d, e, Representative images from
wild-type (d) and ΔmmpL7 (e) M. marinum HBV infections quantified in
Fig. 1d. N = 18 (wild-type) or 16 (ΔmmpL7) larvae per group. HBVs are
outlined with a dashed white line. Scale bar, 100 µm.

Extended Data Figure 3 | Knockdown of MyD88 results in a late,
dose-dependent hypersusceptibility to M. marinum systemic infection.
a, RT-PCR for actin (top) and myd88 (bottom), demonstrating that the
majority of myd88 transcripts at 7 dpf are abnormal in MyD88 morphants.
Lanes marked ‘b’ and ‘c’ correspond to morphants from the same experiments
depicted in panels b and c, respectively. The abnormal larger transcript
(indicated by an asterisk) results from the inclusion of intron 2 in the final
transcript, incorporating a premature stop codon that truncates the protein
before the TIR (Toll/interleukin receptor) domain. b, c, Caudal vein infection of
MyD88 morphants with 141 (b) or 325 (c) c.f.u. M. marinum/larva. Bacterial
burden was assessed by FPC; values plotted represent the mean ± s.e.m.
Time points were compared by one-way ANOVA and Bonferroni’s post-tests.
***P < 0.001. d, Representative images of larvae at 5 dpi from experiment
in c, N = 30 control, 15 MyD88 morphant. Scale bar, 500 µm.


©2014 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 4 | Characteristics of macrophages recruited to
wild-type and PDIM-deficient bacteria. a, Mean Mpeg1-positive
macrophages recruited at 3 hpi into the HBV of wild-type fish after infection
with 80 wild-type or ΔmmpL7 M. marinum. b, Data from Fig. 2c expressed as
mean numbers of total infected macrophages and iNOS-expressing infected
macrophages after HBV infection with 80 wild-type, ΔmmpL7, or Δerp
M. marinum. c, Bacterial burdens after L-NAME treatment. Mean bacterial
burdens of 2 dpf control (CTRL)- or iNOS inhibitor (L-NAME)-treated fish
after HBV infection with 80 wild-type or ΔmmpL7 M. marinum. NS, not
significant.

Extended Data Figure 5 | Wild-type bacterial burdens after co-infection
with wild-type or ΔmmpL7 bacteria. Representative images from the HBV
co-infections quantified in Fig. 2e. a, b, Red fluorescent wild-type (WT)
M. marinum co-infected with green fluorescent wild-type (a) or ΔmmpL7
(b) M. marinum. N = 18 (wild-type) and 19 (ΔmmpL7) larvae per group. Scale
bar, 50 µm. c, Wild-type bacterial burdens after co-infection with wild-type or
ΔmmpL7 M. marinum with and without L-NAME treatment. Significance
tested by one-way ANOVA with Bonferroni’s post-test for comparisons shown.

Extended Data Figure 6 | MyD88-dependent macrophage recruitment
occurs in response to PDIM deficiency rather than being due to loss of
another MmpL7-exported product. a, Mean macrophage recruitment at 3 hpi
into the HBV of wild-type or MyD88-morphant (MO) larvae after infection
with 80 Δmas M. marinum. Student’s unpaired t-test. b, Mean surviving
bacterial volume of red fluorescent wild-type M. marinum (initial infection
dose of 30-40 c.f.u.) when co-infected with 30-40 green fluorescent wild-type,
ΔmmpL7 or Δmas M. marinum at 3 dpi. Representative of two separate
experiments. Significance tested by one-way ANOVA with Tukey’s post-test.

Extended Data Figure 7 | Gating strategy and isotype controls for iNOS
staining of mouse lung. a, Representative gating strategy for isolation of
inflammatory monocytes. A dump channel containing anti-CD4, CD8 and
CD11c was plotted against a channel exhibiting autofluorescence and also
containing anti-Ly6G. Using these markers, T cell, dendritic cell, alveolar
macrophage, and neutrophil cell populations were excluded from the
double-negative gate. Inflammatory monocytes were identified within the
double-negative population by their co-expression of Ly6C and CD11b.
These cells were then evaluated for intracellular iNOS expression. a, N = 4 per
group (Fig. 3a, b) or b, with isotype control antibodies, N = 4 per group.

recruitment in wild-type and CCR2-morphant larvae. a, Mean macrophage
recruitment at 3 hpi into the HBV of control (ctrl), or CCR2-morphant (CCR2)
larvae after injection of vehicle control (‘mock’; 0.1% BSA in PBS), human
CCL2 (hCCL2), human CCL4 (hCCL4), or human CCL5 (hCCL5). b, c, Mean
macrophage (b) and neutrophil (c) recruitment at 3 hpi into the HBV of control
injection of vehicle control (mock), murine CCL2 (mCCL2), human CCL2
(hCCL2), human IL-8 (hIL-8), or human LTB4 (hLTB4). Representative of
three separate experiments. Significance assessed by one-way ANOVA with
Bonferroni’s post-test for the comparisons shown. *P < 0.05; ***P < 0.001.

250-300 wild-type M. marinum. These assays were performed on the same

Extended Data Figure 10 | Infectivity assay. a, b, Representative 5 hpi images
from Fig. 4d following HBV infection with one (a) or three (b) M. marinum.
Scale bar, 100 µm. N values for fish represented in a and b (that is, those
found to be infected with 1-3 bacteria) are presented in Fig. 4d (18, 22, 28, 28,
28, 28, 22, 22 for the respective conditions as specified in the figure).
c, Mean bacterial burdens 5 h after HBV infection with 1-3 wild-type (WT),
ΔmmpL7 or Δpks15 M. marinum.

Transcranial amelioration of inflammation and cell death after brain injury

Theodore L. Roth1, Debasis Nayak1, Tatjana Atanasijevic1, Alan P. Koretsky1, Lawrence L. Latour1 & Dorian B. McGavern1


Traumatic brain injury (TBI) is increasingly appreciated to be highly 
prevalent and deleterious to neurological function’”. At present, no 
effective treatment options are available, and little is known about 
the complex cellular response to TBI during its acute phase. To gain 
insights into TBI pathogenesis, we developed a novel murine closed- 
skull brain injury model that mirrors some pathological features 
associated with mild TBI in humans and used long-term intravital 
microscopy to study the dynamics of the injury response from its 
inception. Here we demonstrate that acute brain injury induces 
vascular damage, meningeal cell death, and the generation of react- 
ive oxygen species (ROS) that ultimately breach the glial limitans 
and promote spread of the injury into the parenchyma. In response, 
the brain elicits a neuroprotective, purinergic-receptor-dependent 
inflammatory response characterized by meningeal neutrophil swarm- 
ing and microglial reconstitution of the damaged glial limitans. We 
also show that the skull bone is permeable to small-molecular-weight 
compounds, and use this delivery route to modulate inflammation 
and therapeutically ameliorate brain injury through transcranial 
administration of the ROS scavenger, glutathione. Our results shed 
light on the acute cellular response to TBI and provide a means to 
locally deliver therapeutic compounds to the site of injury. 

TBI encompasses injuries that range from mild to severe'”, and 
occurs when the brain is exposed to external forces that induce focal 
and/or diffuse pathologies, including vascular damage, oedema, axonal 
shearing and neuronal cell death*®. TBI is usually divided into two 
phases: the primary insult and the ensuing secondary reaction. It is 
postulated that primary cell death cannot be prevented without avoid- 
ing the injury itself, but that secondary damage is amenable to thera- 
peutic intervention because it is driven by pathogenic parameters 
such as ROS’®, calcium release’, glutamate toxicity’®"’, mitochondrial 
dysfunction”’, inflammation’, and so on. Animal models of TBI have 
been developed that reflect mild, moderate and severe forms of injury”;
but therapeutic research in these models has not yet translated suc- 
cessfully into the clinic*’*. Thus, there is an increasing need to develop 
additional TBI models, temporally map the dynamics of brain injury 
responses, and devise therapeutic interventions. 

In humans, primary injury to the meninges and vasculature can be 
observed in the absence of conspicuous brain damage after minor head 
trauma. As part of an ongoing study of mild TBI, we evaluated research 
magnetic resonance imaging (MRI) with contrast from patients pre- 
senting to the emergency room within 48h of a minor head injury. 
Over a period of 30 months, 142 patients were enrolled with a baseline 
Glasgow Coma Scale of 15, reporting loss of consciousness or post- 
traumatic amnesia, and a clinical computed tomography (CT) scan with- 
out evidence of injury to the parenchyma. Meningeal haemorrhage 
was seen on CT in 18 patients (12.7%), including subarachnoid blood 
in 13 (9.1%) and subdural blood in 7 (4.9%). Focal enhancement of the 
meninges was observed on post-contrast fluid attenuated inversion 
recovery (FLAIR) MRI imaging (Fig. 1a) in 69 (48.6%) patients, and 
without concomitant meningeal haemorrhage in 53 (36.9%) patients. 
Enhancement is the result of extravasation of gadolinium contrast into
space containing free fluid with a T1 relaxation time constant equivalent
to that of cerebrospinal fluid (CSF)™*.

To understand better the immunopathogenesis of focal brain injury, 
we developed a novel closed-skull model of mild TBI amenable to intravital
imaging studies. Thinning the murine skull bone to ~30 µm allows
the underlying meninges and brain parenchyma to be imaged by two- 
photon laser scanning microscopy (TPM) without overt brain injury or 
inflammation’’. Thinning the skull bone beyond 30 µm causes increased
pliability and concavity, which compresses the meningeal space (referred
to as a compression injury) (Extended Data Fig. 1). Sequential thinning
of the skull bone from 50 to 10 µm induced increasing amounts of
meningeal cell death (Fig. 1b). Cell death and inflammation associated
with over-thinning were reproducibly generated by quickly thinning
the skull bone to ~20-30 µm and then manually promoting concavity
with minimal downward pressure (Extended Data Fig. 1). We used this 
model to define the dynamics of inflammation and the mechanisms 
that cause cell death after focal TBI. 

Using TPM we first mapped the kinetics and severity of brain patho- 
logy, starting 5 min after compression injury. Immediately after injury 
quantum dots injected intravenously leaked from vessels into the sub- 
arachnoid and perivascular spaces (Fig. 1c and Supplementary Video 1). 
Within 30 min, ROS were detected in the meninges (Fig. 1d) and holes 
appeared in the glial limitans due to astrocyte cell death (Fig. 1e and
Supplementary Video 1). Transcranially administered SR101, a 600 
molecular weight (MW) dye, leaked through the glial limitans into the 
parenchyma after compression injury, but remained largely within the 
meningeal space after standard skull thinning (Fig. 1f). Compression 
also induced cell death in the meninges that increased steadily over 
time, but was not observed in the parenchyma until 9-12 h after injury 
(Fig. 1g, h). Parenchymal cell death at 12 h was indiscriminate, as neu- 
rons, astrocytes, oligodendrocytes and microglia were all lost in the 
lesion site (Extended Data Fig. 2). 

We next sought insights into the dynamics of the innate inflammat- 
ory response. Meningeal macrophages (long, rod-like cells) died within 
30 min of compression injury (Fig. 2a and Supplementary Video 1). In 
response to meningeal cell death, microglia extended processes through 
the compromised glial limitans into the meninges (Extended Data Fig. 3a 
and Supplementary Video 1). We also observed a coordinated microglial
response to compression injury. Most microglia within 50 µm
of the meninges retracted all processes except for ~2-3 that extended 
towards the glial limitans, forming a stable contiguous network resem- 
bling a ‘honeycomb’ structure (Fig. 2b and Supplementary Video 2). 
Long-term TPM revealed that the honeycomb network formed within 
an hour of injury and could be maintained for up to 12 h (Supplemen- 
tary Video 2). Honeycomb microglia surrounded surviving astrocytes 
in the glial limitans and aligned with the junctions between individual 
cells (Extended Data Fig. 3b and Supplementary Video 2). 

In response to astrocyte death (Fig. 1d and Supplementary Video 1), 
a morphologically distinct microglial reaction was observed; microglia 
retracted their ramified processes and extended a single, non-branching 
process towards the glial limitans that resembled a jellyfish (Fig. 2c, 
Extended Data Fig. 3b and Supplementary Video 3). ‘Jellyfish’ microglia 


1National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland 20892, USA. 
Figure 1 | Pathology associated with compression injury. a, MRI of a 
patient’s brain 19 h after a fall from 1.8 m, with reported loss of consciousness, 
post-traumatic amnesia, a Glasgow Coma Scale of 15 on arrival, and a negative 
CT scan. After administration of gadolinium diethylenetriaminepentacetate 
(Gd-DTPA) contrast agent, FLAIR MRI (greyscale images) revealed focal 
enhancement in the meninges along the convexity underlying the area of 
blunt trauma, better visualized on surface-rendered three-dimensional 
FLAIR (pseudo-coloured three-dimensional images), involving the frontal 
and temporal lobes as well as anterior aspects of the cerebral falx. 
b-d, f, g, Maximum projections (5 µm wide) are shown in the xz plane of 
two-photon z-stacks captured through a thinned murine skull. b, Images of 
skull bone (blue) and underlying meninges show sequential skull thinning from 
50 to 10 µm. Dead cells (red) were labelled by transcranial propidium iodide 
(PI) administration. Scale bar, 50 µm. c, Intravenously injected Q-dots (red) 
leak from blood vessels into the meninges 15 min after compression injury, 


started to form almost immediately after compression injury (Sup- 
plementary Videos 3, 4), and some were motile, whereas others remained 
stationary (Supplementary Videos 3-5). We commonly observed hon- 
eycomb networks of microglia interspersed with clusters of jellyfish 
microglia (Supplementary Videos 4, 5), which probably reflects vari- 
ation in lesion severity along the glial limitans. Honeycomb microglia 
could even transform within 5 min into jellyfish microglia, presumably 
after astrocyte cell death (Supplementary Video 5), whereas naive microglia 
indicative of vascular damage. Scale bar, 50 µm. d, ROS (red) labelled with 
Amplex Red appear in the meninges 30 min after compression injury of 
CX3CR1gfp/+ (green) mice relative to uncompressed controls. White dotted 
line indicates the glial limitans. macs, macrophages. Scale bar, 30 µm. e, xy 
maximal projections (25 µm in depth) captured in GFAP-GFP mice show 
holes (white arrows) in the glial limitans as a result of astrocyte (green) death, 
which starts to occur 5 min after compression injury. Scale bar, 50 µm. 
f, Transcranially applied SR101 (red) diffuses into the brain parenchyma 
30 min after injury, but is largely excluded from the parenchyma in an 
uncompressed control mouse. Scale bar, 50 µm. g, Cell death (PI+ cells; red) 
becomes apparent in the brain parenchyma 12 h after compression injury and is 
not observed in uncompressed controls. Scale bar, 50 µm. h, Quantification of 
cell death (mean ± standard deviation (s.d.)) in the meninges and parenchyma 
after compression injury. All data in the figure are representative of three mice 
per group (four mice, g) and at least three independent experiments. 


required ~30 min to acquire a jellyfish morphology. Jellyfish projec- 
tions were usually linked to cell bodies via thin processes (Supplemen- 
tary Video 6) and often formed a continuous phagocytic layer at the glial 
limitans (Fig. 2c, Extended Data Fig. 3b and Supplementary Videos 3, 6, 7). 
Over time, microglia residing in the glial limitans died, particularly 
after tissue swelling (or oedema) was observed (Supplementary Video 7). 
Peripherally derived myelomonocytic cells (neutrophils and monocytes) 
also responded to brain damage. Within an hour of compression injury, 
Figure 2 | Innate immune response to a compression injury. 
a-d, Twenty-five micrometre xy maximum projections from CX3CR1gfp/+ 
mice (a-c) or LysMgfp/+ mice (d) captured at 30 min (a), 1 h (b), 2 h (c) or 6 h 
(d) in a normal thinned skull preparation (uncompressed) or after a 
compression injury. a, Meningeal macrophages (green) visualized in 
CX3CR1gfp/+ mice burst and die within 30 min of compression injury 
relative to uncompressed controls. Blood vessels are red. Scale bar, 50 µm. 
b, Microglia (green) retract their ramified processes and form a highly 
connected honeycomb network at the glial limitans after compression injury. 
Scale bar, 25 µm. c, A subset of microglia (green) after compression injury 
retract ramified processes and generate a single, flat, motile, phagocytotic 
process at the base of the glial limitans, resembling a jellyfish (examples denoted 
with white asterisks). Blood vessels are red. Scale bar, 25 µm. d, LysMgfp/+ 
neutrophils (green) are recruited to the site of injury, but not to an 
uncompressed thinned skull window. Green cells residing in the uncompressed 
window represent meningeal macrophages. Scale bar, 100 µm. Data are 
representative of three mice per group and at least three independent experiments. 


myelomonocytic cells (probably neutrophils) localized exclusively to 
the meninges, were highly motile, and interacted with dead cells during 
the 12 h observation period (Fig. 2d and Supplementary Video 8). 

To modulate TBI lesions locally, we applied compounds to the intact 
skull bone. We discovered that SR101, when applied to an intact (non- 
thinned) skull bone, passed directly into the meninges within 10 min 
(referred to as a ‘transcranial application’) (Fig. 3a). We next tested a 
range of differently sized fluorescent dextrans (3,000-70,000 MW). 
Dextrans of 40,000 MW and below were able to pass through the intact 
skull into the meninges (Fig. 3b), although larger dextrans required 
longer diffusion times (Fig. 3c). A 70,000 MW dextran was unable to 
pass transcranially in 30 min. In addition, a variety of fluorescent small 
molecules and macromolecules passed through the murine skull bone 
and achieved measurable steady-state concentrations in the meninges 
dependent on molecular weight (Fig. 3d and Extended Data Fig. 4). 
Passage through an intact skull yielded a meningeal concentration approxi- 
mately one half that achieved by thinned skull application (Extended 
Data Fig. 4d, e). We assessed the feasibility of passing compounds through 
thicker skulls by applying the contrast agent manganese chloride to an 
Figure 3 | Metrics of transcranial diffusion through the skull bone. a, A 600 
MW fluorescent dye, SR101 (red), was applied to an intact mouse skull for the 
indicated time and then the skull (blue) was quickly thinned and imaged. Five 
micrometre xz maximum projections show that SR101 is detectable in the 
meninges beginning 10 min after application and fully saturates the space 
within 15 min. White dotted line indicates glial limitans. Scale bar, 50 µm. 
b, The size dependence of diffusion through an intact skull bone was evaluated 
30 min after continuous transcranial application of the indicated molecular 
weight dextrans (red). Dextrans that passed successfully through the skull 
generated fluorescence in the meninges. The skull bone is shown in blue, 
and the glial limitans is denoted with a white dotted line. Scale bar, 50 µm. 
c, A colour-coded table summarizing the imaging results shown in panels a 
and b denotes the presence (green) or absence (red) of fluorescent dye in the 
meninges at the indicated molecular weight and time. Grey, not tested. 
d, Fluorescent compounds of increasing molecular weights were passed 
transcranially through a thinned skull window during imaging. Steady-state 
concentrations (mean ± s.d.) of the fluorescent compounds in the meninges 
and parenchyma were quantified from normalized fluorescence intensities. See 
also Extended Data Fig. 4. e, Manganese chloride (Mn; 500 mM solution) 
applied transcranially to an intact rat skull (~1 mm thick) is visible by MRI 
in the brain parenchyma 2 h after application (white arrow). The mean 
parenchymal manganese concentration ± s.d. is provided. Scale bar, 1 mm. All 
data in the figure are representative of three mice (or rats) per group and at least 
three independent experiments. 
intact rat skull bone (~1 mm thick) and imaging transcranial passage 
by MRI (Fig. 3e). Transcranially applied manganese chloride was clearly 
visible in the rat brain parenchyma 2 h after application. 

We next defined the mechanisms underlying compression injury- 
induced inflammation. Transcranial application of purinergic receptor’® 
(P2RY12 or P2RX4) inhibitors before compression injury prevented 
both honeycomb and jellyfish morphologies, whereas P2RY6 antago- 
nism only blocked the jellyfish response (Fig. 4a, b, Extended Data 
Fig. 5a and Supplementary Video 9). In contrast, P2RX7 antagonism 
had no effect on microglia, but almost entirely eliminated neutrophil 
recruitment (Fig. 4a-c, Extended Data Fig. 5b and Supplementary 
Video 9). Astrocytes are known to amplify purinergic receptor signal- 
ling through ATP-induced ATP release via connexin hemichannels, 
which can be blocked with carbenoxolone (CBX)”. Transcranial applica- 
tion of CBX, but not a specific pannexin inhibitor (probenecid), before 
compression injury caused microglia to remain ramified, extending 
only small, ill-defined circular processes at the glial limitans, similar 
to what was observed after P2RY6 antagonism (Fig. 4d, e, Extended 
Data Fig. 5c and Supplementary Video 9). CBX inhibited the formation 
of honeycomb and jellyfish microglia, whereas pannexin inhibition 
slowed the onset and magnitude of neutrophil recruitment (Fig. 4f). 
Pre-treatment with CBX also significantly increased SR101 leakage 
through the glial limitans, suggesting that ATP release by astrocytes 
and purinergic signalling in microglia help maintain barrier integrity 
between the meninges and parenchyma (Fig. 4g and Extended Data 
Fig. 5d). 

To identify the primary mediator of cell death after compression 
injury, we focused on the role of ROS, which appeared in the meninges 
shortly after compression injury (Fig. 1d). Transcranial administration 
of the ROS scavenger glutathione (GSH) resulted in near complete 
survival of meningeal macrophages after a more severe injury (that 
is, skull fracture) (Fig. 5a and Supplementary Video 10) as well as glial 
limitans preservation (Fig. 5b, f). In addition, microglia beneath this 


layer remained in a non-reactive, ramified state (Fig. 5a, e and Sup- 
plementary Video 10). Preservation of the glial limitans after GSH 
treatment resulted in refilling of the subarachnoid space beneath the 
compression injury, which pushed the thinned skull bone upward 
(Supplementary Video 10). GSH administration also eliminated the 
recruitment of myelomonocytic cells (Fig. 5c, g and Supplementary 
Video 10). 

Cell death was first observed in the meninges and later spread to 
the parenchyma after compression injury (Fig. 1h). Transcranial pre- 
treatment with GSH resulted in a 50% reduction in meningeal death, 
but administration after injury had no effect (Fig. 5h), indicating that 
half the initial meningeal cell death following compression injury is 
due to ROS. GSH, when applied continuously starting at 15 min or 3 h 
after injury, reduced parenchymal cell death at 12 h by 67% and 51%, 
respectively (Fig. 5d, i). These data indicate that ROS are a mediator of 
cell death after compression injury. The contribution of inflammation 
to cell death was assessed by transcranially inhibiting neutrophils and 
microglia with P2RX7 or P2RY6 antagonism, respectively. Inhibition 
of neutrophil recruitment through P2RX7 antagonism increased cell 
death in the meninges 12 h later, but had no impact on parenchymal 
cell death (Fig. 5h, i). Conversely, inhibition of microglia through P2RY6 
antagonism increased parenchymal cell death at 12 h, but did not affect 
meningeal cell death (Fig. 5h, i). These data suggest that inflammation 
is neuroprotective within the first 12 h of compression injury. 
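
The percentage reductions quoted above follow from a simple comparison of dead-cell counts in treated and vehicle groups. A minimal sketch of that arithmetic is given below; the function name and the cell counts are hypothetical and were chosen only so that the output matches the 67% and 51% figures, not taken from the underlying data.

```python
def percent_reduction(control: float, treated: float) -> float:
    """Percent decrease of `treated` relative to `control`."""
    if control <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (control - treated) / control


# Hypothetical dead-cell counts per imaging field (illustration only).
vehicle_count = 300.0
gsh_15min_count = 99.0   # GSH started 15 min after injury
gsh_3h_count = 147.0     # GSH started 3 h after injury

print(round(percent_reduction(vehicle_count, gsh_15min_count)))  # 67
print(round(percent_reduction(vehicle_count, gsh_3h_count)))     # 51
```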

TBI induces a complex reaction that can result in permanent damage 
and neurological dysfunction. In this study, we observed evidence of 
meningeal damage in ~50% of patients with mild head injury, indi- 
cating that this is a common pathology in humans. We sought mech- 
anistic insights into this process by developing a novel closed-skull 
model of brain injury and imaging the acute cellular injury response 
from its inception. Importantly, we discovered that the skull bone is 
porous and permits the passage of small molecules (≤40,000 MW) and 
contrast agents by passive diffusion, which should facilitate local delivery 
Figure 4 | Purinergic receptor signalling mediates the innate immune 
response to compression injury. a, Honeycomb microglia were quantified 3 h 
after compression injury in mice treated transcranially with P2RX7, P2RY6, 
P2RX4 and P2RY12 antagonists or vehicle. Uncompressed mice (Uncomp.) 
served as a negative control. Blue dots represent individual microglia, and the 
horizontal red line denotes the mean. b, Quantification of jellyfish microglia 
was performed 3 h after compression injury. Because two compression injuries 
were generated per mouse, data are represented as a ratio (mean ± s.d.) of 
purinergic receptor antagonist/vehicle compared with vehicle/vehicle. A ratio 
of one signifies no difference between the two hemispheres. c, Quantification of 
neutrophils per mm? tissue (mean ± s.d.) was performed at 1, 3 and 6 h after 
compression injury. d-f, Bar graphs (mean ± s.d.) show quantification of 
CX3CR1gfp/+ microglia with a honeycomb (d) or jellyfish (e) morphology as 
well as the number of LysMgfp/+ neutrophils (f) after transcranial 
administration of CBX or probenecid. g, Glial limitans permeability was 
quantified by generating two compression injuries per mouse. The mean SR101 
fluorescence in the parenchyma beneath each injury was calculated and 
expressed as a ratio (CBX/vehicle or vehicle/vehicle). A ratio larger than one 
signifies increased permeability in the experimental group. Asterisks in all 
panels denote statistical significance (P < 0.05) relative to the vehicle control 
group. Data are representative of three mice per group and at least three 
independent experiments. 
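
The paired-hemisphere ratio described in this legend (two injuries per mouse, one treated and one vehicle, compared with vehicle/vehicle animals) can be sketched as follows. This is only an illustrative outline under stated assumptions: the function names and all measurements are hypothetical, and the statistical test used to assign significance is not reproduced here.

```python
from statistics import mean, stdev


def hemisphere_ratios(pairs):
    """Return one experimental/control ratio per mouse.

    Each pair is (signal beneath the first injury, signal beneath the
    vehicle-treated injury) measured in the same animal.
    """
    ratios = []
    for experimental, control in pairs:
        if control <= 0:
            raise ValueError("control measurement must be positive")
        ratios.append(experimental / control)
    return ratios


def summarize(ratios):
    """Mean and sample standard deviation of the per-mouse ratios."""
    return mean(ratios), stdev(ratios)


# Hypothetical measurements (arbitrary units), three mice per group.
antagonist_vs_vehicle = [(2.0, 1.0), (3.0, 1.0), (4.0, 1.0)]
vehicle_vs_vehicle = [(1.0, 1.0), (0.9, 1.0), (1.1, 1.0)]

treated_mean, treated_sd = summarize(hemisphere_ratios(antagonist_vs_vehicle))
baseline_mean, baseline_sd = summarize(hemisphere_ratios(vehicle_vs_vehicle))
print(f"antagonist/vehicle: {treated_mean:.2f} ± {treated_sd:.2f}")
print(f"vehicle/vehicle:    {baseline_mean:.2f} ± {baseline_sd:.2f}")
```

A ratio near one in the vehicle/vehicle group is the expected baseline; a treated ratio well away from one indicates a difference between the two injuries.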
Figure 5 | Transcranial administration of 
glutathione reduces inflammation and cell death 
after compression injury. a-i, Twenty-five 
micrometre xy maximum projections (a, c) and 
5 µm xz projections (b, d) were captured in 
CX3CR1gfp/+ (a), B6 (b, d) or LysMgfp/+ (c) mice 
after compression injury (n = 3, a-c, e-g; n = 4, 
d, h-i). a, GSH pre-treatment prevented jellyfish 
and honeycomb microglia formation (green) 1 h 
after a compression injury that resulted in a 
cracked skull (blue). GSH administration also 
promoted survival of meningeal macrophages 
(macs: green; white arrows). Scale bar, 100 µm. 
b, Relative to the vehicle control group, GSH 
pre-treatment prevented glial limitans breakdown 
observed 1 h after injury. SR101 (red) localizes 
above the glial limitans (indicated by a white dotted 
line) in the GSH-treated group. Scale bar, 50 µm. 
c, In GSH pre-treated mice, no neutrophil response 
(green) to compression injury was observed at 6 h. 
Scale bar, 100 µm. d, GSH administered 15 min or 
3 h after a compression injury significantly reduced 
parenchymal cell death observed at 12 h. PI+ dead 
cells (red) reside primarily in the meninges of 
of therapeutics and other molecules into the central nervous system. 
Pathologically, compression injury initially caused meningeal cell death, 
vascular damage, ROS generation, and disruption of the glial limitans, 
which ultimately gave rise to indiscriminate parenchymal cell death. 
ROS are commonly observed in TBI lesions’®*, and transcranial delivery 
of GSH preserved the glial limitans, reduced cell death, and eliminated 
the sterile injury response. GSH significantly reduced parenchymal cell 
death even when administered 3 h after injury, providing a therapeutic 
window for treatment of focal brain injury. 

In the absence of GSH, the brain responded to damage by eliciting 
an anatomically partitioned sterile immune reaction’®. Microglia first 
fortified the glial limitans through the generation of honeycomb and 
jellyfish structures. Honeycomb microglial networks circumscribed 
individual surviving astrocytes in the glial limitans and were induced 
to do so by the release of ATP from connexin hemichannels and its detec- 
tion by purinergic receptors (P2RX4 and P2RY12)'*'”"°. Phagocytic 
jellyfish microglia were similarly generated by purinergic signalling 
(P2RX4, P2RY6, P2RY12) and filled areas along the glial limitans in 
which astrocyte cell death had occurred. Transcranial inhibition of the 
microglial response through P2RY6 or connexin hemichannel antagon- 
ism increased the permeability of the glial limitans and parenchymal 
cell death after compression injury. Although microglia protected the 
parenchyma, myelomonocytic cells invaded the damaged meninges in 


control group. 


a P2RX7- and pannexin-dependent manner, consistent with a recent 
study showing P2RX7-dependent neutrophil recruitment into the injured 
liver’. Collectively, our data suggest that the acute inflammatory reac- 
tion to brain injury is beneficial’. Moreover, ROS and purines represent 
major drivers of the injury response and are amenable to transcranial 
therapeutic manipulation. 


METHODS SUMMARY 


Human patients presenting to the emergency room within 48 h of mild head injury 
were evaluated as part of an ongoing Traumatic Head Injury Neuroimaging Classifi- 
cation (THINC) study. MRI scans were obtained and evaluated after injection of a 
gadolinium-based contrast agent. To model the pathology of mild head injury, we 
surgically thinned the murine skull bone over the barrel cortex to a thickness of 
~20-30 um and then applied minimal downward pressure to promote concavity 
in the bone (meningeal compression) (Extended Data Fig. 1). As a control, a similar 
surgical procedure was conducted without applying downward pressure. All intravital 
imaging was performed using a Leica SP5 two-photon microscope and subsequent 
image analysis was performed using Imaris 7.0 software. Transcranial administra- 
tion of compounds was achieved by placing compounds resuspended in artificial 
cerebral spinal fluid (aCSF) directly on top of the exposed skull bone (either 
thinned or completely intact). Compounds entered the CNS via passive diffusion 
through the bone. For these experiments, rodents were subsequently imaged using 
a Leica two-photon microscope or an 11.7-Tesla MRI scanner. Immunohistochemical 
stains of TBI lesions in mice were imaged using an Olympus FV1200 confocal 
microscope. 
Structural basis for hijacking CBF-β and CUL5 E3 
ligase complex by HIV-1 Vif 
The human immunodeficiency virus (HIV)-1 protein Vif has a cent- 
ral role in the neutralization of host innate defences by hijacking 
cellular proteasomal degradation pathways to subvert the antiviral 
activity of host restriction factors’ °; however, the underlying mech- 
anism by which Vif achieves this remains unclear. Here we report a 
crystal structure of the Vif-CBF-β-CUL5-ELOB-ELOC complex. 
The structure reveals that Vif, by means of two domains, organizes 
formation of the pentameric complex by interacting with CBF-β, 
CUL5 and ELOC. The larger domain (α/β domain) of Vif binds to 
the same side of CBF-β as RUNX1, indicating that Vif and RUNX1 
are mutually exclusive for CBF-β binding. Interactions of the smaller domain 
(α-domain) of Vif with ELOC and CUL5 are cooperative and mimic 
those of SOCS2 with the latter two proteins. A unique zinc-finger 
motif of Vif, which is located between the two Vif domains, makes 
no contacts with the other proteins but stabilizes the conformation 
of the α-domain, which may be important for Vif-CUL5 interaction. 
Together, our data reveal the structural basis for Vif hijacking of the 
CBF-β and CUL5 E3 ligase complex, laying a foundation for rational 
design of novel anti-HIV drugs. 

Human primary cells express restriction factors including APOBEC3 
family members to block the replication and spread of HIV, a group of 
obligatory intracellular retroviruses’ ''. One common strategy used by 
HIV-1 for its replication in host cells is to hijack cellular proteasomal 
degradation pathways to degrade the host restriction factors. A critical 
HIV protein involved in this process is HIV-1 virion infectivity factor 
(Vif) that is expressed in most lentiviruses’*. Interaction with Vif 
results in recruitment of APOBEC3G to an E3 ubiquitin ligase complex 
containing the scaffold protein cullin5 (CUL5) and substrate adaptors 
elongin B (ELOB) and elongin C (ELOC), promoting APOBEC3G poly- 
ubiquitination and degradation and thereby damping APOBEC3G- 
mediated cellular defences’®. A conserved Vif motif, called BC-box 
(residues 144-155), is required for Vif interaction with ELOB-ELOC 
through mimicking a conserved cellular SOCS-box motif of the SOCS- 
box proteins’. The HCCH (His 108, Cys 114, Cys 133 and His 139) 
motif of Vif, a unique zinc finger motif, is also important for Vif binding 
to CUL5, Vif-mediated degradation of APOBEC3G and HIV-1 infecti- 
vity’’. Another host protein, core-binding factor subunit beta (CBF-β), 
was recently shown'*”* to be simultaneously hijacked by HIV-1 Vif to 
form the Vif-CBF-β-CUL5-ELOB-ELOC complex. Vif is considered 
a good target for anti-HIV drugs because of its essential roles in HIV-1 
infection; however, the structure of HIV-1 Vif alone or in the context of 
functional complexes is lacking. 

To facilitate structural study, we reconstituted a Vif (residues 1-192)- 
CBF-β (residues 1-170)-CUL5 (residues 12-386, nCUL5)-ELOB (resi- 
dues 1-102)-ELOC (residues 17-112) complex using purified proteins 
(Fig. 1a and Extended Data Fig. 1a). The absence of Vif-CBF-β reduced 
the interaction between the nCUL5 fragment and the ELOC-ELOB 
complex (Fig. 1a), indicating that the former two proteins have a critical 
role in promoting assembly of the pentameric complex. This is consistent 


with the observation’*’ that SOCS2 is important for CUL5 interaction 
with ELOC-ELOB. The pentameric complex was crystallized and its 
structure was determined using molecular replacement (Extended Data 
Table 1 and Extended Data Figs 1b and 2). The overall complex struc- 
ture has a U-shaped architecture, with nCUL5 and CBF-β-Vif corres- 
ponding to the two straight arms (Fig. 1b). Interaction of ELOC with 
nCUL5 and Vif forms the bent arm of the U-shaped structure. The Vif 
structure can be divided into a larger and a smaller domain, with a zinc 
ion binding between them (Fig. 1b). One side of CBF-β forms extensive 
contacts with the larger domain of Vif, whereas the carboxy-terminal 
peptide of CBF-β is sandwiched between the two domains of Vif. Vif 
also interacts with nCUL5. Thus, Vif has a central role in organizing the 
Vif-CBF-β-CUL5-ELOB-ELOC pentameric complex. Consistent with 
a previous study'*, CBF-β-Vif binding causes no marked conforma- 
tional changes in CUL5-ELOB-ELOC (Extended Data Fig. 3). ELOB 
makes no interaction with components other than ELOC, but we cannot 
rule out the possibility that the C-terminal portion (residues 103-118) 
of ELOB that was absent from the truncated ELOB in our construct is 
involved in the formation of the pentameric complex. 

CUL1 is a close homologue of CUL5 and forms a complex with SKP1, 
a substrate adaptor of cullin-RING E3 ubiquitin ligases’’. The CUL1- 
SKP1 and nCUL5-ELOC complexes share a similar structural organ- 
ization, with the amino termini of CUL1 and CUL5 binding to SKP1 and 
ELOC, respectively (Fig. 1c). This observation supports the idea that 
the cullin-RING E3 scaffold proteins have a conserved mode of binding 
to substrate adaptors. However, the amino acids at the nCUL5-ELOC 
interface differ from those at the CUL1-SKP1 interface (Extended Data 
Fig. 4a-c), indicating that these residues govern specific interaction of 
cullin-RING E3 scaffold proteins with their respective substrate receptors. 

Vif in the complex maintains an elongated, cone-like shape (Fig. 2a) 
with highly positive charges on its surface (Fig. 2b). A database search 
using the DALI server (http://ekhidna.biocenter.helsinki.fi/dali_server) 
identified no structures appreciably similar to that of Vif, indicating that 
the viral protein possesses a novel fold. The two-domain structure of 
Vif, however, is reminiscent of the cellular substrate receptors SOCS2 
and VHL that share the common substrate adaptor ELOBC with Vif 
(refs 16, 18, 19). The larger domain (referred to as the α/β domain) of Vif 
contains a distorted five-stranded antiparallel β-sheet with three heli- 
ces tightly packing against the convex side (Fig. 2a). Two loosely pack- 
ing helices form the smaller domain (referred to as the α-domain) of 
Vif that harbours the BC-box motif (residues 144-155) found in VHL 
and SOCS-box proteins'*'*"’. The zinc-finger motif, HCCH, makes tet- 
rahedral coordination to Zn2+ (Fig. 2c and Extended Data Fig. 2b) and 
stabilizes the three inter-domain loops (Fig. 2a). Stabilization of the 
three inter-domain loops is further strengthened by their interaction 
with each other, which in turn rigidifies the two helices from the 
α-domain of Vif. The amino acids from the two domain regions, in 
particular those involved in packing of the secondary structural elements, 
are conserved among the Vif family of proteins (Extended Data Fig. 5a). 
[Figure 1a gel panel: input and GST pull-down lanes; components GST-nCUL5, Vif-CBF-β-ELOB-ELOC, GST, Vif-CBF-β and ELOBC.] 

Figure 1 | U-shaped structure of the Vif-organized Vif-CBF-β-CUL5- 
ELOB-ELOC complex. a, Vif and CBF-β are important for ELOB-ELOC 
interaction with nCUL5 (residues 12-386). GST-nCUL5 was first bound to 
glutathione-sepharose and incubated with Vif-CBF-β-ELOB-ELOC, Vif- 
CBF-β or ELOB-ELOC protein as indicated. After extensive washing, the 
bound proteins were visualized by Coomassie blue staining after SDS-PAGE. 
b, Overall structures of Vif-CBF-β-nCUL5-ELOB-ELOC in two different 


CBF-β and Vif form extensive interactions, burying a total surface 
area of 4,797 Å2. CBF-β binds to a hydrophobic surface of Vif, leaving 
the positively charged surface exposed (Fig. 3a). The N-terminal pep- 
tide (residues 6-12) of Vif forms an antiparallel β-sheet with β-strand 
S3 from CBF-β (Fig. 3a, b). Vif residues Trp 5, Val 7 and Ile 9 from the 
peptide point to the central region of the β-barrel and make hydrophobic 


orientations. Colour codes for the proteins are indicated. The grey sphere 
indicates Zn2+. N and C represent N and C termini, respectively. Residue 
numbers are indicated. c, CUL5-ELOC and CUL1-SKP1 have a similar 
structural organization. Shown in the figure is structural superimposition of 
CUL1 (green)-SKP1 (blue) (Protein Data Bank code 1LDK)"”” and CUL5- 
ELOC highlighting the conserved interface between the two complexes. 


contacts with their respective neighbouring residues of CBF-β (Fig. 3b). 
A C-terminal peptide of CBF-β contributes to the CBF-β-Vif inter- 
action via binding to a surface pocket formed between the two domains 
of Vif, rendering the Vif-bound Zn2+ completely solvent inaccessible 
(Fig. 3a, c). Packing of the C-terminal portion of helix H5 from CBF-β 
against the central region of Vif also seems to be critical for the CBF-β-Vif 




Figure 2 | Vif contains two domains with zinc binding between them. 
a, Overall structure of Vif shown in cartoon representation. Some of the 
secondary structural elements of the α-domain and α/β domain of Vif are 
labelled. b, Vif has a highly positively charged surface. Two views of the 
electrostatic surface potential map of Vif are shown. White, blue and red 
indicate neutral, positive and negative surfaces, respectively. c, The 
zinc-binding motif of Vif. The side chains of zinc-finger binding residues 
(His 108, Cys 114, Cys 133 and His 139) are shown as stick representation, 
and the Zn2+ as a grey sphere. 
Figure 3 | Vif and RUNX1 overlap when interacting with CBF-β. a, Overall 
structure of the Vif-CBF-β complex (CBF-β in cartoon representation and 
Vif in electrostatic surface representation). Two regions of the Vif-CBF-β 
interaction are highlighted with purple and blue frames. Some of the secondary 
structural elements of CBF-β are labelled. b, A close-up view of the detailed 
interactions between Vif and CBF-β highlighted in a. Side chains from Vif 
involved in the interaction are shown in slate, and those from CBF-β in yellow. 


interaction (Extended Data Fig. 6a). Consistent with the extensive 
CBF-β-Vif interactions, a CBF-β variant (residues 1-140) retained 
the ability to interact with Vif? (Extended Data Fig. 6b). 

As a co-transcription factor of the RUNX family members, CBF-β 
associates with RUNX1, regulating expression of immunity-related 
genes”. Structural superposition of the CBF-β-RUNX1 (ref. 22) and 
CBF-β-Vif complexes showed that Vif completely overlaps with RUNX1 
(Fig. 3d), indicating that Vif and RUNX1 are mutually exclusive for 
binding CBF-β. Supporting the structural observation, mutation of 
Asn 104 of CBF-β, which interacts with both Vif and RUNX1 (Fig. 3d), 
results in disruption of the Vif-CBF-β and RUNX1-CBF-β complexes”. 
In contrast, mutation of CBF-β Phe 68, which only interacts with Vif 
(Fig. 3d), impairs the Vif-binding activity of CBF-β but has no effect on 
the interaction of RUNX1 with CBF-β (ref. 24). A greater buried sur- 
face area as a result of Vif binding to CBF-β (4,797 Å2) than RUNX1 
(3,941 Å2) indicates that CBF-β might have a higher affinity for Vif 
than RUNX1. 
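
The comparison of buried surface areas can be made explicit with a short calculation. The sketch below uses only the two values quoted in the text; the variable names are illustrative, and buried surface area is at best a rough proxy for binding affinity, which is why the conclusion above is stated tentatively.

```python
# Buried surface areas quoted in the text, in square ångströms.
vif_buried_area = 4797.0     # Vif bound to CBF-β
runx1_buried_area = 3941.0   # RUNX1 bound to CBF-β

difference = vif_buried_area - runx1_buried_area
ratio = vif_buried_area / runx1_buried_area

print(f"Additional area buried by Vif: {difference:.0f} Å2")  # 856 Å2
print(f"Vif/RUNX1 buried-area ratio: {ratio:.2f}")            # 1.22
```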

In addition to CBF-β, Vif also interacts with CUL5 and ELOC via 
its α-domain (Fig. 4a). As observed previously’, the BC-box motif 
(helix α4) of Vif contacts H4 and its preceding loop of ELOC (Fig. 4b). 
Consistently, mutation of two equivalent HIV-1 Vif residues in HXB2 
Vif* (I120S and L124S) or other key residues impaired Vif inter- 
action with CUL5. The structural observations were further confirmed 
by mutations of other key residues involved in Vif-ELOC interaction 
(Extended Data Fig. 7). Structural comparison between the Vif-ELOC 
and SOCS2-ELOC’”* complexes (Fig. 4c) showed that, despite their 
low sequence homology, α4 of Vif is well aligned with H4 of SOCS2, 
indicating that Vif mimics SOCS2 for binding to ELOC. SKP1 shares 
sequence homology with ELOC’”. Comparison of the Vif-CUL5-ELOC 
and CUL1-SKP1-SKP2 complexes” showed that α4 and α3 of Vif are 
also similarly positioned to the helix H6 of SKP1 and helix H1 of SKP2, 
respectively, to bind ELOC and CUL5 (Extended Data Fig. 4d), further 
supporting the idea that the cullin-RING E3 scaffold proteins have a 
conserved assembly mode. 




Red dashed lines represent hydrogen bonds. c, A close-up view of the detailed
interactions between a Vif hydrophobic pocket and the CBF-β C terminus
highlighted in a. d, Vif and RUNX1 bind to the same side of CBF-β. Shown
is the structural comparison of the Vif-CBF-β and RUNX1-CBF-β (Protein
Data Bank code 1E50) complexes. Side chains of two residues (Phe 68 and
Asn 104) from CBF-β are shown in stick representation and are highlighted.
The Vif- and RUNX1-bound CBF-β are shown in green and purple, respectively.


CUL5 and CUL2 share the evolutionarily conserved adaptor protein
ELOC, which assembles cullin-RING E3 ligase complexes. Vif, however,
preferentially uses CUL5, but not CUL2, as the scaffold protein to
degrade APOBEC3G (ref. 3). Primary sequence alignment indicated
that three Vif-interacting amino acids of CUL5 (Leu 52, Trp 53 and
Asp 55) are highly variable in CUL2 (Fig. 4d), indicating that they are
the structural determinants for CUL5 selection by Vif. Simultaneous
substitution in CUL5 of the three Vif-interacting residues Leu 52, Trp 53
and Asp 55 and the two ELOC-interacting residues Phe 41 and His 48
with their equivalents in CUL2 greatly impaired the ability of CUL5 to
interact with Vif-CBF-β-ELOB-ELOC (Fig. 4e). A similar result was
also obtained for the L52V and W53A double mutation of CUL5
(Fig. 4e), indicating that these two amino acids have a dominant role
in CUL5 selection by Vif. Leu 52 and Trp 53 of CUL5 are also the epitopes
for preferential selection of CUL5 over CUL2 by SOCS2 (ref. 16).
Structural comparison (Fig. 4c) revealed that α3 and its following loop
(residues 116-131) from the α-domain of Vif are positioned similarly to H5
and H6 of SOCS2 (residues 177-192), a region termed the SOCS2
cullin box, which is responsible for CUL5 binding, further supporting
the structural and functional mimicry of SOCS2 by Vif. These data
strongly suggest that residues 116-131 in the α-domain of Vif act as
a cullin box to preferentially recognize CUL5.
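The point-mutation labels used above (for example L52V and W53A) follow the usual "wild-type residue, position, replacement" shorthand. As a minimal sketch, the helper below parses such labels and checks them against the CUL5 residues named in the paragraph; the function names and the lookup table are illustrative and not part of the original work.

```python
import re

# Wild-type CUL5 residues named in the text, as one-letter codes.
CUL5_WILD_TYPE = {41: "F", 48: "H", 52: "L", 53: "W", 55: "D"}

# One uppercase letter, a residue number, one uppercase letter (e.g. "L52V").
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'L52V' into (wild type, position, replacement)."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def is_consistent(label: str, reference: dict[int, str]) -> bool:
    """True if the wild-type residue in the label matches the reference table."""
    wild_type, position, _ = parse_mutation(label)
    return reference.get(position) == wild_type


for label in ("L52V", "W53A"):
    print(label, parse_mutation(label), is_consistent(label, CUL5_WILD_TYPE))
```

A check of this kind catches labels whose wild-type residue disagrees with the sequence, which is a common source of transcription errors in mutation tables.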

Our structure reveals that Vif has a critical role in organizing assembly
of the Vif-CBF-β-nCUL5-ELOB-ELOC pentameric complex by
interacting with CBF-β, nCUL5 and ELOC (Fig. 1). Supporting this
conclusion, nCUL5 displays a weaker interaction with ELOC-ELOB
in the absence of Vif-CBF-β (Fig. 1a). Additionally, mutation of
Vif Leu 145, which is important for the Vif-ELOC interaction, abolishes
binding of Vif to CUL5 (ref. 12). Although not interacting with the other
components in the pentameric complex, the zinc-finger motif may
stabilize the conformation of the α-domain to promote Vif interaction
with CUL5. Indeed, the F115A mutation of Vif, predicted to perturb the
α-domain (Extended Data Fig. 8), greatly reduces Vif interaction with
CUL5 and compromises Vif-mediated APOBEC3G degradation. The
C-terminal CBF-β peptide may further stabilize the conformation of
the α-domain of Vif by binding between the two domains of Vif.
Consistently, deletion of the CBF-β peptide greatly impaired the
assembly of the Vif-containing pentameric complex (Extended Data
Fig. 6b), although the peptide is not involved in binding to CUL5 or
ELOC. The three Vif-interacting proteins selectively bind to the
hydrophobic surfaces of Vif, leaving the positively charged patches
solvent-exposed (Extended Data Fig. 5b), some of which are important for
APOBEC3G binding. Although the APOBEC3G-binding regions
of Vif vary in different studies, the critical amino acids from them
are exposed in the structure (Extended Data Fig. 5c), indicating that
extensive interactions between Vif and APOBEC3G may be formed.

Figure 4 | Interaction of Vif with nCUL5 and ELOC. a, The α-domain of
Vif interacts with nCUL5 and ELOC. Overall interaction of two helices of the
α-domain with nCUL5 and ELOC is shown. b, Specific interactions of Vif with
nCUL5 and ELOC. A close-up view of the interactions (highlighted in a) of
helices α3 and α4 of Vif with helix H2 and its following loop of nCUL5 and H4
of ELOC, respectively. c, Vif structurally mimics SOCS2. Structural comparison
of Vif-ELOC-CUL5 and SOCS2 (blue)-ELOC (light blue)-CUL5 (lemon)
(Protein Data Bank accession 4JGH); the interfaces of the two complexes
are highlighted in red. d, Sequence alignment of CUL5 and CUL2 around the
Vif-interacting region of CUL5. Conserved and similar residues are highlighted
with red and yellow shading, respectively. Residues of CUL5 involved in Vif
interaction are indicated with slate solid dots at the bottom. e, Structural
determinants for preferential binding of Vif to nCUL5. Wild-type Vif-CBF-β-
ELOB-ELOC complex in which Vif was tagged with poly-His was first bound
to Ni²⁺ resin and incubated with wild-type (WT) or mutant GST-nCUL5
proteins. The bound proteins were visualized by Coomassie blue staining
after SDS-PAGE.

Our data (Fig. 4c) indicate that the α-domain of Vif structurally and
functionally mimics SOCS2. Vif and SOCS2 probably use a conserved
mechanism for selecting the adaptor protein CUL5, because mutation
of the non-conserved CUL5 residues Leu 52 and Trp 53, which are
important for the interaction with SOCS2, also greatly reduces the
efficiency of assembly of the Vif-organized complex (Fig. 4e). In addition
to affinity (Extended Data Fig. 9), a higher abundance of CUL5 than
of CUL2 in certain infected cells (http://biogps.org) may also
contribute to Vif selection of CUL5 during HIV-1 infection. In line
with this, VHL interacts with both CUL2 and CUL5 when overexpressed,
although VHL selects CUL2 under physiological conditions.

We have solved the crystal structure of a Vif-CBF-β-nCUL5-ELOB-ELOC
complex. Our data not only reveal the molecular mechanism
underlying hijacking of the CBF-β and CUL5 E3 ligase complex by Vif,
but also open the possibility of designing and developing novel antiviral
drugs that target the Vif-containing complex.


METHODS SUMMARY


6×His-Vif, CBF-β and ELOB-ELOC were co-expressed in Escherichia coli BL21
(DE3) cells and purified using Ni-NTA (Qiagen) followed by anion exchange
chromatography (Source-15Q, Pharmacia). Glutathione S-transferase (GST)-nCUL5
was expressed and purified separately. The pentameric complex was formed by
mixing Vif-CBF-β-ELOB-ELOC and nCUL5 proteins. After limited proteolysis with
elastase, Vif-CBF-β-nCUL5-ELOB-ELOC was cleaned up using an anion exchange
column and subjected to size-exclusion chromatography (Superdex200 10/300 GL
column, GE Healthcare). Crystals of Vif-CBF-β-nCUL5-ELOB-ELOC were grown
using the hanging-drop vapour diffusion method and the structure was
determined using molecular replacement.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper.

Structural basis of lentiviral subversion of a cellular
protein degradation pathway

Lentiviruses contain accessory genes that have evolved to counteract
the effects of host cellular defence proteins that inhibit productive
infection. One such restriction factor, SAMHD1, inhibits human
immunodeficiency virus (HIV)-1 infection of myeloid-lineage cells
as well as resting CD4+ T cells by reducing the cellular deoxynucleoside
5′-triphosphate (dNTP) concentration to a level at which
the viral reverse transcriptase cannot function. In other lentiviruses,
including HIV-2 and related simian immunodeficiency viruses (SIVs),
SAMHD1 restriction is overcome by the action of viral accessory
protein x (Vpx) or the related viral protein r (Vpr) that target and
recruit SAMHD1 for proteasomal degradation. The molecular
mechanism by which these viral proteins are able to usurp the host
cell’s ubiquitination machinery to destroy the cell’s protection against
these viruses has not been defined. Here we present the crystal structure
of a ternary complex of Vpx with the human E3 ligase substrate
adaptor DCAF1 and the carboxy-terminal region of human SAMHD1.
Vpx is made up of a three-helical bundle stabilized by a zinc finger
motif, and wraps tightly around the disc-shaped DCAF1 molecule
to present a new molecular surface. This adapted surface is then able
to recruit SAMHD1 via its C terminus, making it a competent substrate
for the E3 ligase to mark for proteasomal degradation. The
structure reported here provides a molecular description of how a
lentiviral accessory protein is able to subvert the cell’s normal protein
degradation pathway to inactivate the cellular viral defence system.

HIV-1 infection of myeloid and CD4+ T cells is inhibited by the
post-entry restriction factor SAMHD1. In other primate lentiviruses,
including HIV-2 and SIV, this block is overcome by the expression of
the Vpx accessory protein. Vpx recruits SAMHD1 to the DDB1-CUL4A-ROC1
E3 ubiquitin ligase complex through interaction with the substrate-adaptor
protein DCAF1, and facilitates its degradation through the
proteasomal pathway. To understand the mechanism of Vpx-mediated
recruitment of SAMHD1, we assessed which regions of each molecule
(Fig. 1a) are required for the interaction. These data reveal that only
SAMHD1 molecules containing a C-terminal region (residues 582-626)
are able to support ternary complex formation (Fig. 1b, compare centre
and left panels) and that this region alone is sufficient for the
interaction (Fig. 1b, right). We therefore determined the crystal structure
of the ternary complex of the C-terminal WD40 domain of DCAF1
(DCAF1-CtD) with the Vpx of SIV from the sooty mangabey (Cercocebus
atys) (Vpxsm) and the C-terminal region of SAMHD1 (SAMHD1-CtD).
The crystal structure was solved by single-wavelength anomalous
dispersion (SAD) (Extended Data Fig. 1 and Extended Data Table 1) and
is shown in Fig. 1c. DCAF1-CtD comprises a seven-bladed β-propeller,
a disc-shaped molecule 45 Å in diameter and 20 Å in depth. Vpxsm
comprises an antiparallel V-shaped three-helical bundle that wraps around
one side and the top of DCAF1-CtD. This arrangement of helices is
conserved in the HIV-1 Vpr solution structure. However, the structures
differ significantly at the helical termini and, in Vpxsm, zinc coordinated
by His 39, His 82, Cys 87 and Cys 89 (Figs 1c and 2a) brings together
the C termini of helices 1 and 3 to stabilize the structure. Residues
Asn 606 to Asp 624 of SAMHD1-CtD are also well ordered. They form
two short perpendicular α-helices, helix A (Leu 610-Ala 613) and helix
B (Arg 617-Lys 622), connected by a three-residue linker (S614-S616),
and pack into a cleft between Vpxsm and DCAF1-CtD (Fig. 1c).

The complex contains four interfacial regions (Fig. 2b): a combined
Vpxsm-SAMHD1-DCAF1 ternary interface (Fig. 2c) and a more extensive
DCAF1-Vpxsm-binding surface with three sites of interaction
(Fig. 2d-f). The Vpxsm-SAMHD1 interaction buries 700 Å² of molecular
surface. At the interface the hydrophobic side chains of Leu 610,
Val 618, Leu 620 and Phe 621 from SAMHD1 helices A and B pack into a
hydrophobic pocket between the amino termini of Vpxsm helices 1 and 3
(Fig. 2c). The interface also contains electrostatic interactions between
acidic residues Glu 15 and Glu 16 at the N terminus of Vpxsm helix 1
and Arg 609 and Arg 617, part of a tribasic Arg 609-Arg 617-Lys 622
motif, in SAMHD1. By contrast, the contact area between SAMHD1-CtD
and DCAF1-CtD is small, just 210 Å². The only direct interaction
is between Lys 622 in the SAMHD1 tribasic motif and Asp 1092 of
DCAF1 in the acidic Glu 1091-Asp-Glu 1093 loop connecting blades
7 and 1 of the DCAF1-CtD propeller (Fig. 2c). Asp 1092 of SAMHD1-CtD
is also hydrogen bonded to Tyr 66 of Vpxsm in an interaction that
bridges SAMHD1-CtD and Vpxsm. Several other key residues in Vpxsm
also mediate bridging interactions. These include Trp 24, which stacks
against Arg 617 of SAMHD1-CtD as well as hydrogen bonding to
Asn 1135 of DCAF1 (Fig. 2e), and Tyr 69, which packs with Val 618
of SAMHD1-CtD and is also hydrogen bonded to the side chain of
Glu 1091 in the DCAF1-CtD Glu-Asp-Glu loop.

The DCAF1-Vpxsm interface is much larger, made up of three
sites of interaction burying 2,000 Å² of surface (Fig. 2d-f and Extended
Data Table 2). The first involves the N-terminal extended region of Vpxsm
(Glu 6-Ser 13), which packs against the concave surface of DCAF1 blade 1,
spanning from the underside to the topside of the disc and making several
hydrogen bonds and hydrophobic interactions (Fig. 2d). Further
interactions involve Trp 24, Thr 28, Ile 32 and Gln 76 in helices 1 and 3 of
Vpxsm, which contact residues on blades 1 and 2 of DCAF1-CtD and the
interspersing loop (Fig. 2e). Helix 3 of Vpxsm lies in a groove between
blades 1 and 7 on the upper face of DCAF1-CtD. Interactions between
the charged and hydrophobic side chains of Lys 77, Arg 70, Phe 80 and
Met 81 along the length of helix 3 with residues in the intra-strand
loops of blades 1 and 7 of DCAF1-CtD, including the Glu-Asp-Glu loop,
comprise the third region of the DCAF1-Vpxsm interface (Fig. 2f).

Mutations of many of the residues at these interfaces have been shown
previously to reduce viral infectivity, disrupt binding of Vpx proteins
to DCAF1 and interfere with proteasomal degradation of SAMHD1
(Extended Data Table 3). Examples include: the bridging residue of
Vpxsm, Trp 24 (ref. 10; Fig. 2c, e); Vpxsm Gln 76 (ref. 11), which is
hydrogen bonded to Asn 1135 and Trp 1156 of DCAF1 (Fig. 2e); and Vpxsm
Lys 77 (ref. 12), which is integral to an extensive salt-bridge network
that links residues in the DCAF1 Glu-Asp-Glu loop with Arg 70, Tyr 69
and Tyr 66 in helix 3 of Vpxsm (Fig. 2f).

Figure 1 | The SAMHD1-CtD-Vpxsm-DCAF1-CtD complex. a, Schematic
of proteins. CD, chromo domain; DD, dimerization domain; HD, His/Asp
domain; SAM, sterile α-motif. Regions coloured grey (DCAF1, 1058-1396), red
(SAMHD1, 582-626) and blue (Vpxsm, 1-112) were used for crystallization.
b, Size-exclusion chromatograms (black) of equimolar mixtures of Vpxsm,
DCAF1-CtD and SAMHD1(26-583) (left), SAMHD1(26-626) (middle) and
SAMHD1(582-626) (right). Chromatograms from individual components
are also shown: Vpxsm (blue), DCAF1-CtD (grey) and SAMHD1 (red).
SDS-PAGE analyses of peaks are inset. Peak 1 (void volume) contains
unspecific aggregates. A280 nm, absorbance at 280 nm; Ve, elution volume.
c, Cartoon representation of the ternary complex. DCAF1-CtD is shown as a
grey surface; β-propeller blades are numbered. SAMHD1-CtD is red, Vpxsm is
blue and a zinc ion is shown as a grey sphere.


In the structure, SAMHD1-CtD makes salt bridges at both the Vpxsm
and DCAF1-CtD interfaces through charged side chains in the tribasic
motif. In vitro binding studies show that full-length SAMHD1 and
SAMHD1-CtD have comparable affinity for the DDB1-Vpx-DCAF1
complex. However, to test whether SAMHD1-CtD alone is sufficient
for recruitment by Vpx and to assess the contribution of the tribasic
motif, an in vivo reporter assay was used. SAMHD1-CtD was fused to
the C terminus of a tandem green fluorescent protein (GFP)-tagged
nuclear localization signal (NLS) protein that localizes to the nucleus.
Human 293T cells transduced with the NLS-GFP-SAMHD1-CtD fusion
(Fig. 3a) display GFP fluorescence in their nuclei (Fig. 3b). Delivery of
Vpx by infection with SIVmac Vpx+ (SIV isolated from Macaca mulatta)
virus greatly reduces the number of GFP+ cells (Fig. 3b, c). By contrast,
infection with SIVmac Vpx− virus or addition of the proteasomal
inhibitor MG132 with Vpx+ virus does not reduce the population of GFP+
cells (Fig. 3b, c), indicating that loss of GFP fluorescence results from
Vpx-mediated proteasomal degradation of NLS-GFP-SAMHD1-CtD.
In addition, mutation of residues in the SAMHD1 tribasic motif (Arg 609-
Arg 617-Lys 622) to alanine or glutamate also abolishes or severely
diminishes the capacity of Vpx to induce degradation of NLS-GFP-SAMHD1-CtD
(Fig. 3b, c). These data reveal that SAMHD1-CtD in
isolation acts as a Vpx-dependent degron to induce the proteasomal
turnover of a heterologous protein, and that disruption of the
protein-protein interactions observed in the crystal structure prevents
Vpx-mediated degradation. When the same mutations are introduced into
SAMHD1 they also reduce the capacity of Vpx to induce degradation,
albeit to varying degrees (Fig. 3d). However, all mutants show wild-type
restriction of HIV-1, indicating that the CtD is not required for
SAMHD1 anti-HIV-1 activity (Fig. 3e).

Figure 2 | Intermolecular interfaces. a, Zn ion (grey sphere) and surrounding
residues. Co-ordinating residues are displayed as sticks, co-ordinate bonds as
green dashes. b, Cartoon representation of SAMHD1-CtD-Vpxsm-DCAF1-CtD.
DCAF1-CtD is shown in grey, cylinders represent α-helices in SAMHD1-CtD
(red) and Vpxsm (blue), intermolecular interfaces are highlighted by green
boxes (I, II, III, IV). c-f, Views of the interface between
SAMHD1-CtD-Vpxsm-DCAF1-CtD (box I) and Vpxsm-DCAF1-CtD (boxes II-IV).
Residues contributing to the interface are shown as sticks, hydrogen-bonding
interactions as dashed lines and residues important for Vpx function are
highlighted with an asterisk.

Figure 3 | SAMHD1 C-terminal region. a, The NLS-GFP-SAMHD1-CtD degron.
WT, wild type; mut., mutant. b, Microscopic images (20× magnification) of
uninfected and SIV Vpx+/− infected 293T cells expressing NLS-GFP-SAMHD1-CtD
wild type or R617E mutant. c, Quantitative analysis of GFP-SAMHD1 degron
expression after infection with SIV Vpx+/− viruses. Percentages of GFP+
cells are plotted against SIV virus titre. RT units, reverse transcriptase
activity (mU ml⁻¹). MG132 was added where indicated. Top left panel is
representative of three independent experiments. d, Immunoblots of 293T
cells co-transfected with SAMHD1 and Vpx plasmids (Un., untransfected;
−, empty vector; +, Vpx) probed with antibodies to SAMHD1 (top),
14-3-3 (middle) or VpxHIV-2 (bottom). Representative of two independent
experiments. e, Restriction of HIV-1 infection by wild-type SAMHD1 and
mutants. Restriction is expressed as a ratio of HIV-infected
SAMHD1-transduced to infected SAMHD1-negative cells (>3 independent
experiments with different viral stocks). Data represent the mean value and
error bars are standard deviation.

To put the DCAF1-CtD-Vpxsm-SAMHD1-CtD structure into the
context of the E3 ligase, a molecular model of the entire
CUL4A-DDB1-ROC1-DCAF1-SAMHD1-Vpxsm assembly was constructed by
superposition of DCAF1-CtD-Vpxsm-SAMHD1-CtD onto existing structures
in the Protein Data Bank (PDB). First, the β-propellers of DCAF1-CtD
and the related substrate adaptor, DDB2, were aligned (Extended Data
Fig. 2, inset), facilitating substitution of DCAF1-CtD for DDB2 in the
existing DDB1-DDB2 structure. The DDB1-CUL4A-ROC1 interface
is highly flexible. Therefore, the DCAF1-CtD-Vpxsm-SAMHD1-CtD-DDB1
model was superposed onto the two most extreme conformations
available, allowing the range of orientations that CUL4A-ROC1
can adopt with respect to Vpxsm and SAMHD1 to be visualized (Extended
Data Fig. 2). The model places Vpxsm and SAMHD1-CtD on the opposite
face of the DCAF1-CtD disc from the DCAF1-DDB1-binding site,
accessible to the RING domain of ROC1. Moreover, in both conformations
SAMHD1 is placed in the proximity of the ROC1 RING domain,
ideally located for ubiquitin transfer. Notably, regions of SAMHD1
proximal to the bound SAMHD1-CtD are required for catalytic
activation/tetramerization, and association with Vpx-DCAF1-DDB1 inhibits
SAMHD1 catalysis, suggesting that recruitment to the CUL4A-DDB1-ROC1
complex might additionally downregulate SAMHD1 activity
through tetramer disassembly.

Reprogramming of the CUL4-DDB1-ROC1 E3 ubiquitin ligase is
also used by paramyxovirus and hepatitis B virus to subvert the cellular
antiviral response. These viruses usurp the interaction of DDB1 with
DCAF1 by installing the viral substrate recruitment factors V or X in
its place. In lentiviruses, a different strategy has evolved in which
the substrate recruitment factor DCAF1 is itself adapted by association
with accessory proteins to create a new binding pocket, in the case of
Vpxsm and VpxHIV-2 to recruit SAMHD1 through its C-terminal region
(Extended Data Fig. 3). Notably, the SAMHD1 tribasic motif is conserved
among primates but is absent in species that are not HIV/SIV hosts
(Fig. 4a). Similarly, the N-terminal Vpxsm sequence PPGNSGEET
(residues 9-17), containing Glu 15 and Glu 16 that make salt bridges with
Arg 609 and Arg 617, is conserved among all Vpx proteins that target human
SAMHD1 for degradation (Fig. 4b), suggesting that the complementarity
between these motifs is a driver of the specificity of the SAMHD1-CtD-Vpx
interaction. In Vpx proteins that do not induce degradation of human
SAMHD1 (refs 7, 20), present in viruses isolated from red-capped
mangabey (C. torquatus) (Vpxrcm) and mandrill (Mandrillus sphinx)
(Vpxmnd-2), the PPGNSGEET motif is not conserved (Fig. 4b). However, these
Vpx proteins still induce degradation of SAMHD1 but do so by targeting
it to DCAF1 through sequences located towards the SAMHD1
N terminus.
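The residue numbering of the N-terminal Vpx motif can be checked with a short sketch. It assumes, from the subscripts that survive in the text, that the PPGNSGEET motif begins at residue 9; the function name is illustrative only.

```python
MOTIF = "PPGNSGEET"
FIRST_RESIDUE = 9  # assumed start position, inferred from the motif subscripts


def number_motif(motif: str, first_residue: int) -> dict[int, str]:
    """Map sequence positions to residues for a contiguous motif."""
    return {first_residue + offset: aa for offset, aa in enumerate(motif)}


numbered = number_motif(MOTIF, FIRST_RESIDUE)
glutamates = [position for position, aa in numbered.items() if aa == "E"]

print("last residue of motif:", max(numbered))
print("glutamate positions:", glutamates)
```

With that starting position the two glutamates fall at positions 15 and 16, matching the Glu 15 and Glu 16 named as salt-bridge partners of Arg 609 and Arg 617.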

In some SIVs, the evolutionarily related accessory protein Vpr recruits
SAMHD1 for degradation. By contrast, VprHIV-1 still associates with
DCAF1 and the CUL4A-DDB1-ROC1 E3 ligase but results in cell cycle
arrest at G2, probably through recruitment of an unidentified cellular
factor to the E3 ligase. In the complex, we identified a structural
zinc-binding site in Vpxsm (Fig. 2a) that is conserved in both Vpx and
Vpr proteins (Fig. 4b). Mutation of His 71 in VprHIV-1, the equivalent
of the Vpxsm zinc-coordinating His 82, abrogates DDB1-DCAF1 binding
and Vpr-induced cell cycle arrest. Furthermore, most of the other
conserved residues map to the DCAF1 interface (Fig. 4c) and mutation
of two of these (Vpr Gln 65 and Trp 18, the equivalents of Vpx Gln 76 and
Trp 24) also results in loss of Vpr function (Extended Data Table 3).
These observations suggest a strong structural conservation between
these related accessory proteins. Moreover, given that Vpr-induced cell
cycle arrest is also mediated through association with DCAF1 and the
actions of the CUL4A-DDB1-ROC1 E3 ligase, it is likely that both
factors use a similar mechanism to target cellular proteins to the CUL4A
complex (Extended Data Fig. 3). Consequently, although the cellular
target(s) of Vpr are unknown, the ternary complex presented here provides
a structural model for the design of therapeutic agents that target the
Vpx- and Vpr-DCAF1 interaction.

Figure 4 | Species specificity of the SAMHD1-Vpx interaction. a, Sequence
alignment of the C termini of SAMHD1. One-hundred per cent type-conserved
residues are highlighted red, 60% are highlighted cyan. Residue numbers refer
to human SAMHD1-CtD. Red stars indicate side chains involved in Vpxsm-
DCAF1-CtD binding, red dots indicate main-chain interactions. Black dots
indicate residues whose mutation impairs DDB1-DCAF1-Vpx binding.
Species above the red line are HIV-2/SIV hosts. Phos, phosphorylation site.
Ca, Cercocebus atys; Cf, Canis familiaris; Cj, Callithrix jacchus; Ct, Cercocebus
torquatus; Dd, Dictyostelium discoideum; Dr, Danio rerio; Gg, Gallus gallus;
Hs, Homo sapiens; Mam, Macaca mulatta; Mum, Mus musculus; Pt, Pan
troglodytes; Rn, Rattus norvegicus; Xl, Xenopus laevis. See Methods for sequence


METHODS SUMMARY 


Strep-II-tagged SAMHD1 residues 582-626 (SAMHD1-CtD) and glutathione
S-transferase (GST)-tagged Vpxsm were expressed in Escherichia coli strain
Rosetta 2 (DE3) and purified by affinity and gel-filtration chromatography.
His-tagged DCAF1, residues 1058-1396 (DCAF1-CtD), was expressed in insect
cells and purified using Ni-NTA Sepharose and gel-filtration chromatography.
For structure determination, selenium was incorporated into Vpxsm and
DCAF1-CtD by supplementing culture media with seleno-methionine. The ternary
complex was assembled by incubating the components in a molar ratio of
DCAF1-CtD:Vpxsm:SAMHD1-CtD = 1:1.5:1.5 overnight on ice and purified by
gel-filtration chromatography. Details of protein crystallization, data
processing, structure determination and refinement are provided
in Methods. Structure figures were prepared in PyMOL
(http://pymol.sourceforge.net/). The degradation of tandem NLS-GFP fused to
SAMHD1-CtD stably expressed in 293T cells was analysed by fluorescence
microscopy and flow cytometry. SAMHD1 degradation by Vpx was assessed in
293T cells by transfection and immunoblotting. Infectivity of HIV-1 in U937
cells in the presence of SAMHD1 mutants was determined using two-colour
flow cytometry.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper.

METHODS

Cloning, protein expression and purification. The DNA sequence coding for
human SAMHD1 residues Q582-M626 (SAMHD1-CtD) was amplified by PCR
from a cDNA template using the oligonucleotide primer sequences: forward,
GGCGGATCCTCAGGATGGCGATGTTATAGCCCC; reverse,
GGCGCGGCCGCTCATCACATTGGGTCATCTTTAAAAAGCTGG. The PCR product was gel
purified and ligated into the pET52b plasmid (Merck Millipore) using
standard restriction enzyme cloning to generate an N-terminal
Strep-II-tagged fusion protein. N-terminally GST-tagged sooty mangabey Vpx
residues M1-A112 (Vpxsm) in the pET49b plasmid (Merck Millipore) was a gift
from D. Goldstone. Human DCAF1 residues A1058-E1396 (DCAF1-CtD) were
amplified by PCR from a cDNA template using the oligonucleotide primer
sequences: forward, GGCCCATGGCATCATTTCCAAAGTATGGAGGGG; reverse,
GGCGAGCTCCTCTGCCAGACGCTGCCTGCC. The PCR product was gel purified and
ligated into the pTriEx-6 plasmid (Merck Millipore) using standard
restriction enzyme cloning to generate a C-terminal 10×His-tagged fusion
protein. Insert sequences were verified by DNA sequencing.
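The text does not name the restriction enzymes used for cloning. As an illustrative sketch only, the primers can be scanned for the recognition sequences of a few common enzymes; the choice of enzymes in the table below is an assumption made for this example, not information from the paper, although the recognition sequences themselves are the standard ones.

```python
# Recognition sequences of some commonly used restriction enzymes.
# The selection of enzymes is an assumption for illustration.
RESTRICTION_SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "NcoI": "CCATGG",
    "SacI": "GAGCTC",
}

# Primer sequences as given in the Methods, with line breaks removed.
PRIMERS = {
    "SAMHD1-CtD forward": "GGCGGATCCTCAGGATGGCGATGTTATAGCCCC",
    "SAMHD1-CtD reverse": "GGCGCGGCCGCTCATCACATTGGGTCATCTTTAAAAAGCTGG",
    "DCAF1-CtD forward": "GGCCCATGGCATCATTTCCAAAGTATGGAGGGG",
    "DCAF1-CtD reverse": "GGCGAGCTCCTCTGCCAGACGCTGCCTGCC",
}


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based index} for each recognition site in the primer."""
    hits = {}
    for enzyme, site in RESTRICTION_SITES.items():
        index = primer.find(site)
        if index != -1:
            hits[enzyme] = index
    return hits


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
```

Each primer carries exactly one of these sites immediately after a three-base GGC overhang, which is consistent with, but does not prove, a directional cloning strategy using those enzymes.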

SAMHD1-CtD and GST-Vpxsm were expressed in the E. coli strain Rosetta 2
(DE3) (Merck Millipore). Bacterial cultures were grown in an incubator shaker at
37 °C. Protein expression was induced by the addition of 0.1 mM
isopropyl-β-D-thiogalactoside (IPTG) at A600 nm = 0.5. Afterwards, cultures
were cooled to 18 °C and grown for a further 20 h. To produce
seleno-methionine (SeMet)-labelled Vpxsm, cells were grown to A600 nm = 0.5
at 37 °C in M9 minimal medium. Then, an amino acid supplement (L-lysine,
L-phenylalanine and L-threonine to a final concentration of 100 mg l⁻¹;
L-isoleucine, L-leucine, L-valine and L-SeMet to a final concentration of
40 mg l⁻¹) was added to inhibit endogenous methionine biosynthesis and start
SeMet incorporation. Fifteen minutes after addition,
the culture was cooled to 18 °C. Protein expression was induced by the addition of
0.5 mM IPTG, and cells were grown for a further 20 h.
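The amino acid supplement scales linearly with culture volume. The following sketch computes the mass of each component from the final concentrations given in the paragraph; the culture volume passed in is a free parameter, and the 2-litre value in the usage line is only an example.

```python
# Final concentrations from the Methods text, in mg per litre of culture.
SUPPLEMENT_MG_PER_LITRE = {
    "L-lysine": 100.0,
    "L-phenylalanine": 100.0,
    "L-threonine": 100.0,
    "L-isoleucine": 40.0,
    "L-leucine": 40.0,
    "L-valine": 40.0,
    "L-SeMet": 40.0,
}


def supplement_masses_mg(culture_volume_litres: float) -> dict[str, float]:
    """Mass of each supplement component (mg) for a given culture volume."""
    if culture_volume_litres <= 0:
        raise ValueError("culture volume must be positive")
    return {
        name: concentration * culture_volume_litres
        for name, concentration in SUPPLEMENT_MG_PER_LITRE.items()
    }


# Example: a hypothetical 2-litre culture.
for name, mass in supplement_masses_mg(2.0).items():
    print(f"{name}: {mass:.0f} mg")
```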

Cultures were centrifuged for 20 min at 4,500g and 4 °C. Cell pellets were
resuspended in 30 ml lysis buffer (50 mM Tris-HCl pH 7.8, 500 mM NaCl, 4 mM
MgCl₂, 0.5 mM TCEP, 1× EDTA-free mini complete protease inhibitors (Roche),
0.1 U ml⁻¹ Benzonase (Novagen)) per pellet of 1 litre bacteria culture.
Cells were lysed in an EmulsiFlex-C5 (Avestin). The lysate was cleared by
centrifugation for 1 h at 48,000g and 4 °C. All further purification steps
were performed at 4 °C or on ice. The cleared
lysates were applied to 10 ml StrepTactin (IBA, for SAMHD1-CtD) or to 10 ml
Glutathione-Sepharose (GE Healthcare, for GST-Vpxsm) columns. Columns were
washed with 600 ml wash buffer (50 mM Tris-HCl pH 7.8, 500 mM NaCl, 4 mM
MgCl₂, 0.5 mM TCEP). Column-immobilized GST-Vpxsm was additionally washed
with 250 ml of wash buffer supplemented with 5 mM ATP, followed by 250 ml
wash buffer supplemented with 1% CHAPS. SAMHD1-CtD was eluted from the
column with 25 mM ammonium bicarbonate pH 7.5, 2.5 mM desthiobiotin. The
elution peak was lyophilized, resuspended in 500 µl of 25 mM ammonium
bicarbonate pH 7.5 and applied to a Superdex 75 16/60 gel-filtration column (GE
Healthcare) equilibrated with 25 mM ammonium bicarbonate pH 7.5. The peak
fractions were pooled, lyophilized and resuspended in 10 mM Bis-Tris propane pH
8.5, 150 mM NaCl, 4 mM MgCl₂, 0.5 mM TCEP. Small aliquots at a concentration of
approximately 2 mM were flash frozen in liquid nitrogen and stored at −80 °C.
GST-Vpxsm was eluted from the column with wash buffer supplemented with 20 mM
glutathione. The elution peak was concentrated to 5 ml and incubated overnight
with 1 mg GST-3C protease. Cleaved Vpxsm was further purified on a Superdex
200 26/60 gel-filtration column (GE Healthcare) equilibrated with 10 mM
Tris-HCl pH 7.8, 150 mM NaCl, 4 mM MgCl₂, 0.5 mM TCEP. Peak fractions
containing Vpxsm were concentrated to approximately 20 mg ml⁻¹ and flash frozen in
liquid nitrogen in small aliquots for further storage at −80 °C.
Seleno-methionine-substituted Vpxsm was purified in the same way.
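Buffer recipes such as the wash buffer above translate into weighed masses through mass = concentration × volume × molar mass. The sketch below works this out for three of the components. It assumes Tris base, anhydrous NaCl and MgCl₂ hexahydrate as the reagent forms, which the text does not specify, and it leaves out TCEP and the pH adjustment.

```python
# Molar masses in g/mol. The reagent forms are assumptions for illustration.
MOLAR_MASS_G_PER_MOL = {
    "Tris base": 121.14,
    "NaCl": 58.44,
    "MgCl2 hexahydrate": 203.30,
}

# Wash-buffer concentrations from the Methods text, in mM.
WASH_BUFFER_MILLIMOLAR = {
    "Tris base": 50.0,
    "NaCl": 500.0,
    "MgCl2 hexahydrate": 4.0,
}


def grams_needed(conc_millimolar: float, molar_mass: float, volume_litres: float) -> float:
    """Mass in grams = (mol per litre) x litres x (g per mol)."""
    return conc_millimolar / 1000.0 * volume_litres * molar_mass


def recipe(volume_litres: float) -> dict[str, float]:
    """Grams of each listed component for the requested buffer volume."""
    return {
        name: grams_needed(conc, MOLAR_MASS_G_PER_MOL[name], volume_litres)
        for name, conc in WASH_BUFFER_MILLIMOLAR.items()
    }


# The text uses 600 ml of wash buffer per column.
for name, grams in recipe(0.6).items():
    print(f"{name}: {grams:.2f} g")
```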

For the production of DCAF1-CtD, recombinant baculovirus was generated by
co-transfecting Sf9 cells with pTriEx-DCAF1-CtD and linearized BAC10:1629x6
(ref. 27). Sf9 cells were cultured in SF900 II serum-free medium (Invitrogen) at
28 °C. In a typical preparation, 2 litres of Sf9 cells at 2 × 10⁶ cells ml⁻¹
density were infected with 4 ml of high-titre DCAF1-CtD virus for 48 h. For
structure determination, selenium was incorporated into DCAF1-CtD by
supplementing 921A series medium (Expression Systems, LLC) with 50 mg l⁻¹
seleno-methionine.

Sf9 cultures were centrifuged for 20 min at 4,500g and 4 °C. Pellets were
resuspended in 30 ml lysis buffer (50 mM Tris-HCl pH 7.8, 500 mM NaCl, 4 mM
MgCl₂, 30 mM imidazole-HCl pH 7.8, 0.5 mM TCEP, 1× EDTA-free mini complete
protease inhibitors (Roche), 0.1 U ml⁻¹ Benzonase (Novagen)) per pellet of 1
litre culture. Cells were lysed in an EmulsiFlex-C5 (Avestin). The lysate was cleared
by centrifugation for 1 h at 48,000g and 4 °C. All further purification steps were
performed at 4 °C or on ice. The cleared lysate was applied to a 1 ml Ni-NTA
Sepharose column (GE Healthcare). The column was washed with 500 ml wash
buffer (50 mM Tris-HCl pH 7.8, 500 mM NaCl, 4 mM MgCl₂, 30 mM
imidazole-HCl pH 7.8, 0.5 mM TCEP). Protein was eluted with wash buffer supplemented
with 300 mM imidazole pH 7.8. Peak fractions were pooled, concentrated to 2 ml
and applied on a Superdex 200 16/60 gel-filtration column (GE Healthcare)
equilibrated with 10 mM Tris-HCl pH 7.8, 150 mM NaCl, 4 mM MgCl₂, 0.5 mM TCEP.
Peak fractions containing DCAF1-CtD were pooled, concentrated to
approximately 20 mg ml⁻¹ and flash frozen in liquid nitrogen in small aliquots for
further storage at −80 °C. Seleno-methionine-substituted DCAF1-CtD was purified
in the same way.

Assembly of the ternary complex. Initially, the complex was prepared by mixing
100 µg of DCAF1-CtD, Vpxsm and SAMHD1-CtD in 10 mM Bis-Tris propane pH
8.5, 150 mM NaCl, 4 mM MgCl2, 0.5 mM TCEP, followed by incubation overnight
on ice. Complex formation was assessed by applying the mixture on an analytical
Superdex 75 10/300 GL gel-filtration column (GE Healthcare), equilibrated with
10 mM Bis-Tris propane pH 8.5, 150 mM NaCl, 4 mM MgCl2, 0.5 mM TCEP and
SDS-PAGE analysis (molecular weight of markers 97, 66.3, 55.4, 36.5, 31.0, 21.5,
14.4, 6.0 and 3.5 kDa). After optimization, for large-scale preparation of the complex,
DCAF1-CtD, Vpxsm and SAMHD1-CtD were mixed in a molar ratio of 1:1.5:1.5 in
0.1 M Bis-Tris propane pH 8.5, 150 mM NaCl, 4 mM MgCl2, 0.5 mM TCEP and
incubated overnight on ice. The mixture was applied to a Superdex 75 16/60
gel-filtration column (GE Healthcare), equilibrated with 10 mM Bis-Tris propane pH
8.5, 150 mM NaCl, 4 mM MgCl2, 0.5 mM TCEP. Peak fractions containing all
components were pooled, concentrated to approximately 5 mg ml−1 and directly
used for crystallization experiments, after addition of an equimolar amount of
SAMHD1-CtD.
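
As a rough illustration of the 1:1.5:1.5 molar mixing step, the sketch below converts a molar ratio into the mass of each component to weigh in. The molecular weights used in the usage example are placeholder values chosen for illustration only; they are not taken from this paper, and the function name is hypothetical.

```python
from typing import Dict


def masses_for_molar_ratio(
    reference: str,
    reference_mass_ug: float,
    mw_kda: Dict[str, float],
    ratio: Dict[str, float],
) -> Dict[str, float]:
    """Return the mass (µg) of each component needed to reach a molar ratio.

    Uses the identity µg / kDa = nmol, so no further unit factors are needed.
    """
    # nmol of the reference component actually present
    reference_nmol = reference_mass_ug / mw_kda[reference]
    # nmol corresponding to one "part" of the ratio
    nmol_per_part = reference_nmol / ratio[reference]
    return {
        name: nmol_per_part * parts * mw_kda[name]
        for name, parts in ratio.items()
    }


if __name__ == "__main__":
    # Placeholder molecular weights (kDa): assumptions, not values from the paper.
    mw = {"DCAF1-CtD": 40.0, "Vpxsm": 12.0, "SAMHD1-CtD": 3.0}
    ratio = {"DCAF1-CtD": 1.0, "Vpxsm": 1.5, "SAMHD1-CtD": 1.5}
    for name, mass in masses_for_molar_ratio("DCAF1-CtD", 100.0, mw, ratio).items():
        print(f"{name}: {mass:.2f} µg")
```

Substituting the real molecular weights of the purified constructs would give the masses used at the bench.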

Crystallization and structure determination. The native DCAF1-CtD–Vpxsm–
SAMHD1-CtD complex was crystallized by vapour diffusion using an Oryx
crystallization robot (Douglas Instruments). Crystals were obtained from 0.1 µl droplets
containing an equal volume of 2.5 mg ml−1 protein complex in 10 mM Bis-Tris
propane pH 8.5, 150 mM NaCl, 4 mM MgCl2, 0.5 mM TCEP mixed with 0.2 M
magnesium chloride, 0.1 M HEPES pH 7.5, 15% PEG 400. For data collection,
crystals were adjusted to 25% PEG 400 and cryo-cooled in liquid nitrogen. Native
crystals diffracted up to 2.5 Å resolution on beamline I03 at the Diamond Light
Source, and belong to the space group P212121 with cell dimensions of a = 73.79,
b = 82.03, c = 113.29, with a single copy of the complex in the asymmetric unit.
Crystals of the complex containing seleno-methionine-substituted DCAF1-CtD
and Vpxsm were grown using the vapour diffusion method by mixing 1 µl complex
at 5 mg ml−1 with 1 µl reservoir solution containing 0.4 M magnesium sulphate,
0.1 M MES pH 6.5. Crystals were adjusted to 25% glycerol and cryo-cooled in
liquid nitrogen. A SAD data set was collected on beamline I04 at the Diamond
Light Source at a wavelength of 0.97972 Å, corresponding to the anomalous f″
peak wavelength for selenium determined from an X-ray fluorescence scan. The
crystal diffracted up to 3.5 Å resolution and was nearly isomorphous to the native
crystal with the same space group and unit cell dimensions of a = 74.25, b = 82.88,
c = 115.56. Data were reduced using the XDS suite. An initial set of five selenium
sites was found using the programs SHELXC and SHELXD. These sites were used
as input for the program autoSHARP, which added two more sites and
performed density modification, leading to an interpretable electron density map.
A nearly complete model for Vpxsm and a partial polyalanine trace of DCAF1-CtD
were located in this initial map using the buccaneer chain-tracing program.
Completion of the polyalanine trace of DCAF1-CtD and placement of the protein
backbone of SAMHD1-CtD was then undertaken manually in Coot. A round of
refinement against native data recorded to 2.5 Å produced an improved map that
allowed further building of the SAMHD1-CtD and DCAF1-CtD side chains.
Subsequent incorporation of a Zn2+ ion and ligands in Coot combined with
positional, real-space, individual B-factor and TLS refinement in phenix.refine
produced a final model for residues 1073-1314, 1328-1392 of DCAF1-CtD, 5-90,
100-111 of Vpxsm and 606-624 of SAMHD1-CtD with Rwork/Rfree factors of 17.8%/
21.6%. In the model, 97.1% of residues have backbone dihedral angles in the
favoured region of the Ramachandran plot, 2.66% fall in the allowed regions and
0.24% are outliers. Details of data collection and refinement statistics are presented
in Extended Data Table 1.
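
Two of the quantities quoted above lend themselves to a quick numerical check: the photon energy corresponding to the SAD wavelength, and the volume of the orthorhombic unit cell. The following is a minimal sketch with hypothetical function names; it assumes the cell edges are in ångströms and uses the standard value of hc in keV·Å.

```python
# h*c expressed in keV·Å, so that E (keV) = HC_KEV_ANGSTROM / wavelength (Å)
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Å to a photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume (Å^3) of a cell with all angles at 90°, as in space group P212121."""
    return a * b * c


if __name__ == "__main__":
    # Wavelength of the Se-peak data set reported in the text
    print(f"{wavelength_to_energy_kev(0.97972):.3f} keV")
    # Native cell dimensions reported in the text (assumed to be in Å)
    print(f"{orthorhombic_cell_volume(73.79, 82.03, 113.29):.0f} Å^3")
```

The energy obtained this way lies close to the tabulated selenium K absorption edge, which is consistent with a peak wavelength chosen from a fluorescence scan.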

Multiple sequence alignment. Amino acid sequences were aligned using the ClustalW 
server and adjusted manually. Uniprot accessions are as follows. SAMHD1: Homo 
sapiens, Q9Y3Z3; Pan troglodytes, H6WE97; Cercocebus atys, HOWEA6; Cercocebus 
torquatus, HOWEA7; Macaca mulatta, F7CA64; Callithrix jacchus, F7IGP7; Canis 
familiaris, E2QTR2; Mus musculus, Q60710; Rattus norvegicus, D3Z898; Gallus 
gallus, Q5ZJL9; Xenopus laevis, Q6INN8; Danio rerio, Q502K2; Dictyostelium discoideum, 
B0G107. Vpx: Cercocebus atys (isolate sm), P19508; Macaca mulatta (isolate mac),
P05917; Cercocebus torquatus (isolate rcm), NCBI GenBank HM803689; Mandrillus 
sphinx (isolate mnd-2), NCBI reference sequence NP_758889; HIV-2A (isolate rod), 
P06939; HIV-2B (isolate eho), Q89721. Vpr: HIV-2A (isolate rod), P06938; HIV-2B
(isolate eho), P0C1P6; HIV-1MA (isolate mal), P05955; HIV-1MB (isolate yu-2),
P35967. 
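
Accession strings such as those listed above can be sanity-checked before they are used to fetch sequences. The sketch below applies the accession-number pattern published in the UniProt help pages; the helper name is hypothetical, and the check only tests the format, not whether an entry actually exists.

```python
import re

# Accession format as documented by UniProt (6- or 10-character accessions).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(accession: str) -> bool:
    """Return True if the string has the shape of a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(accession) is not None


if __name__ == "__main__":
    for candidate in ["Q9Y3Z3", "H6WE97", "P19508", "NP_758889", "HM803689"]:
        print(candidate, is_uniprot_accession(candidate))
```

GenBank and RefSeq identifiers (for example NP_758889) are expected to fail this check because they follow different naming schemes.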

Modelling of the cullin/RING ubiquitin ligase assembly. Structural
superposition was carried out using SSM implemented in PDBefold (http://www.ebi.ac.uk/
msd-srv/ssm/). DCAF1-CtD was first superposed with the DDB2 β-propeller in
the DDB1–DDB2 structure (PDB accession 3EI3) such that the N-terminal
residues and first β-strand of DCAF1-CtD align with the equivalent sequence just C
terminal to the helical DDB1-binding element of DDB2, termed the H-box. The
second β-propeller domain of DDB1 (BPB), which serves as anchor for the CUL4
scaffold (wheat half-transparent surface), is mobile with respect to the DDB1 BPA
and BPC. The two most extreme BPB conformations available in the PDB database
were used to model the rotational range of the CUL4 arm with respect to the DDB1
BPB (PDB accessions 2HYE, to the left, and 3I7H, to the right).

Restriction assay. SAMHD1 wild-type sequence was inserted into pLGatewayIeYFP
and mutations created by PCR-based site-directed mutagenesis. MoMLV-based
YFP vectors were made by co-transfecting pVSV-G (ref. 36), pKB4 (ref. 37) and
pLgatewaySN_SAMHD1 (wild type or mutants) into 293T cells, harvesting 48 h
after transfection. HIV-1-GFP was made as described earlier by co-transfecting
pVSV-G, p8.91 (ref. 38) and pCSGW (ref. 39). U937 cells were maintained in
RPMI plus L-glutamine (GIBCO) with 10% fetal calf serum (Biosera), penicillin
and streptomycin. Cells (1 × 10^6) were transduced by spinoculation at 1,700 r.p.m.
for 90 min with 0.5 ml neat virus in the presence of 1 µg ml−1 polybrene. Cells were
differentiated by addition of 100 nM phorbol myristate acetate for 72 h and
infected with HIV-1-GFP. Restriction was assessed by two-colour flow cytometry
after 72 h.

Transfection and immunoblotting. SAMHD1 and HIV-2 Vpx sequences
were amplified from pLgatewaySN_SAMHD1 and pIRES2-EGFP-Vpx (gift from
M. Stevenson), respectively, and cloned into pCMS28 (ref. 41). Point mutations
were created by PCR-based site-directed mutagenesis. 293T cells were co-transfected
with 2 µg each of pCMS28-SAMHD1 and -Vpx, wild type, mutant proteins or
empty vector. Cells were harvested 24 h after transfection and analysed by
SDS-PAGE followed by immunoblotting with anti-SAMHD1 3295 (generated in house),
anti-HIV-2 Vpx (hybridoma 6D2.6 supernatant, NIH AIDS Reagent Program
2739) and anti-14-3-3 (C-16, Santa Cruz sc-731).

Degron assay. A reporter construct comprising two copies of GFP with NLS fused
to residues 600-626 of SAMHD1 was synthesized (GenArt; NLS-GFP-SAMHD1-CtD;
Fig. 3a). NLS-GFP-SAMHD1-CtD was inserted into pCMS28 using BglII
and EcoRI. All mutants were created by PCR-based site-directed mutagenesis.
Puromycin virus-like particles were generated by co-transfection of pVSVG, pKB4
and pCMS28-NLS-GFP-SAMHD1-CtD wild type or mutant as above. The stable
cell lines were made by transduction of 293T cells followed by puromycin selection.
VLPs SIV Vpx+ and Vpx− were generated by co-transfection of pVSVG and a SIV
Gag-Pol expression plasmid pSIV3+ (ref. 42) or pSIV3+ vpx− (gift from C. Goujon
and A. Cimarelli). Viral titres were quantified using a modified ELISA for reverse
transcriptase activity (Cavidi). 293T cells stably expressing SAMHD1-GFP wild
type or mutant were seeded at 5 × 10^4 cells per well in a 24-well plate 1 day before
infection. Cells were infected with twofold serial dilutions of replication-defective
SIV Vpx+ virus in the presence of 1 µg ml−1 polybrene. For each experiment, SIV
Vpx− VLPs and proteasome inhibitor MG132 (25.2 µM) were included as negative
controls. After 48 h, cells were harvested and the percentage of GFP-positive cells
was determined by flow cytometry using a FACSVerse analyser (BD Biosciences).
In parallel, GFP expression was analysed by microscopy using an inverted
fluorescent microscope (LEICA).
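
The readout described above combines a twofold serial dilution of virus with a percentage of GFP-positive cells per well. A minimal sketch of both calculations is given below; the starting dose, the number of dilution steps and the event counts are hypothetical values chosen for illustration, not data from the study.

```python
from typing import List


def twofold_dilutions(start_dose: float, steps: int) -> List[float]:
    """Return `steps` doses, each half of the previous one, starting at start_dose."""
    return [start_dose / (2 ** i) for i in range(steps)]


def percent_gfp_positive(gfp_positive_events: int, total_events: int) -> float:
    """Percentage of gated events that are GFP positive."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100.0 * gfp_positive_events / total_events


if __name__ == "__main__":
    # Hypothetical example: arbitrary dose units and made-up event counts
    print(twofold_dilutions(8.0, 4))
    print(percent_gfp_positive(250, 1000))
```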

Extended Data Figure 1 | Experimental electron density. Experimental
electron density observed after solvent flattening for DCAF1, Vpxsm and
SAMHD1 is shown as light blue wireframe, contoured at 1σ. The backbone Cα
traces of the final refined protein structure are shown in green ribbon
representation.

[Figure labels: ~50 Å; ~150° rotation of CUL4A arm possible]

Extended Data Figure 2 | Model of the hijacked CUL4A–DDB1–ROC1 E3
ubiquitin ligase complex. Vpxsm (blue) bound to the substrate-specificity
module DCAF1-CtD (grey) is shown in cartoon representation as in Fig. 1c.
SAMHD1-CtD is shown as red spheres. The DDB1 adaptor is shown as green
cartoon. The inset to the right shows the superposition of the DDB2 helical
hairpin (H-box, orange), which inserts into the binding groove created by
DDB1 β-propellers 1 and 3 (BPA, BPC), and the N terminus of DCAF1-CtD
presented in this study. The CUL4A scaffold is represented as an orange
semi-transparent surface, the ROC1 RING module as purple spheres. Owing to the
conformational freedom of the DDB1–CUL4A connection, the two most
extreme conformations of CUL4A with respect to DDB1 available in the PDB
were modelled. See Methods for modelling procedures and PDB accessions.
The model clearly shows that in both extreme CUL4 conformations, the ROC1
RING finger (purple spheres) is well positioned to reach the SAMHD1 protein,
which would be attached at the N-terminal end of the SAMHD1-CtD. The
SAMHD1 globular fold is probably mobile with respect to the fixed position of
SAMHD1-CtD owing to the flexibility of the sequence stretch between
SAMHD1 residues 583 (the last ordered residue of PDB accession 3U1N) and
606 (the first ordered residue of SAMHD1-CtD presented here).

Extended Data Figure 3 | Analogous mechanism of restriction factor
counteraction in SIV/HIV-2 and HIV-1 Vpx and Vpr. SAMHD1 provides a
potent post-entry block against immunodeficiency viruses in non-cycling cells.
Its dNTP-triphosphohydrolase activity lowers the cellular dNTP pool,
preventing viral reverse transcription. HIV-2/SIVs use their Vpx and Vpr
accessory proteins to modify the host cell's CUL4A–DDB1–DCAF1 ubiquitin
ligase specificity towards SAMHD1, resulting in its proteasomal degradation
and ultimately raising dNTP levels, making the cells permissive to viral
replication. Sequence similarity and comparative functional analysis suggest
that the ancestral HIV-1 accessory protein Vpr uses a similar mechanism to
exploit the CUL4A–DDB1–DCAF1 system to induce proteasomal degradation
of an as yet undiscovered cellular factor whose absence causes cell cycle arrest in
the G2 phase, promoting viral replication and pathogenesis in vivo.

Extended Data Table 1 | Data collection and refinement statistics

                             Se peak                  Native
Data collection
Space group                  P212121                  P212121
Cell dimensions
  a, b, c (Å)                74.25, 82.88, 115.56     73.79, 82.03, 113.29
  α, β, γ (°)                90, 90, 90               90, 90, 90
Resolution (Å)               30-3.5 (3.71-3.5)        30-2.46 (2.61-2.46)
Rsym or Rmerge               5.3 (9.9)                9.1 (52.9)
I/σ(I)                       20.0 (11.5)              15.9 (3.6)
Completeness (%)             98.6 (96)                99.3 (97.8)
Redundancy                   3.5 (3.5)                6.5 (6.4)
FOM (acentric/centric)       0.27/0.06
Anomalous phasing power      0.99

Refinement
Resolution (Å)                                        30-2.47
No. reflections                                       25518
Rwork/Rfree                                           17.6/21.6
No. atoms
  Protein                                             3407
  Ligand PG4                                          26
  Ligand Zn                                           1
  Water                                               25
B-factors (Å2)
  Protein
  Ligand PG4
  Ligand Zn
  Water
r.m.s.d.
  Bond lengths (Å)
  Bond angles (°)

r.m.s.d., root mean squared deviation.
*Values in parentheses are for highest-resolution shell.

Extended Data Table 2 | Intermolecular interactions

(A) Vpx–DCAF1 (1987 Å2)

Vpx      Region   Contact type   DCAF1                      Region
E6       Nt       HB             S1102, R1104mc, R1106mc    WD40 1
I8       Nt       HI             F1107, L1119               WD40 1
P9mc     Nt       HB             Y1131                      WD40 1
N12      Nt       HB             S1130mc, N1132mc           WD40 1
S13      Nt       HB             S1130, N1132               WD40 1
W24      α1       HB             N1135                      WD40 1-2
T28      α1       HB             N1135                      WD40 1-2
I32      α1       HI             W1156                      WD40 2
A35      α1       HI             W1156                      WD40 2
L44      α2       HI             M1375                      WD40 7
L48      α2       HI             M1375                      WD40 7
R51      α2       HB             D1376mc                    WD40 7
W56      α2       HI             L1378                      WD40 7
Y66      α3       HB             D1092                      WD40 7-1
Y69      α3       HB             E1091                      WD40 7-1
R70      α3       SB             E1093                      WD40 7-1
L74      α3       HI             M1375, L1378               WD40 7
Q76      α3       HB             N1135, W1156               WD40 1-2, 2
K77      α3       SB, HB         E1093, M1380mc             WD40 7-1, 7
A78      α3       HI             M1375                      WD40 7-1
F80      α3       HI             F1330, F1355               WD40 6, 7
M81      α3       HI             F1330, F1355               WD40 6, 7
K84      α3       HI, HB         L1313, F1330, F1355mc      WD40 6, 7
K85mc    α3       HB             K1224, R1225               WD40 4
G86mc    α3       HB             K1224                      WD40 4

(B) Vpx–SAMHD1 (703 Å2)

Vpx      Region   Contact type   SAMHD1
G14mc    Nt       HB             L610mc, R611mc
E15      Nt       SB             R609
E16      Nt       SB             R617
I18mc    α1       HB             N606mc, T608mc
G19mc    α1       HB             N606mc
A21      α1       HI             R617, L620
F22      α1       HI             L620, F621
W24      α1       HI             R617, F621
L25      α1       HI             F621
M62      α2-α3    HI             K622
S63      α2-α3    HB             F621mc
Y66      α3       HB, HI         F621mc, K622
Y69      α3       HI             V618, F621

(C) DCAF1–SAMHD1 (207 Å2)

DCAF1    Region     Contact type   SAMHD1
D1092    WD40 7-1   SB             K622

HB, hydrogen bond; HI, hydrophobic interaction; SB, salt bridge. Bi-functional residues are highlighted in red.

Extended Data Table 3 | Vpx and Vpr mutations

SIVsm Vpx residue | HIV-1 Vpr residue | SIVsm Vpx phenotype | HIV-1 Vpr phenotype | Reference
N12 | n.a. | N12A/no SAMHD1 degradation | n.a. | PMID 22362772
S13, T17, T28 | n.a. | S13A,T17A,T28A/no SAMHD1 degradation, reduced HIV-1 infectivity | n.a. | PMID 23076149
E15 | n.a. | E15A/no SAMHD1 degradation | n.a. | PMID 22362772
E16 | n.a. | E16A/no SAMHD1 degradation | n.a. | PMID 22362772
T17 | n.a. | T17A/no SAMHD1 degradation | n.a. | PMID 22362772
W24 | W18 | W24A/no SAMHD1 degradation | W18A/reduced cell cycle arrest | PMID 22776683; PMID 21949789
I32 | n.a. | I32S/no DCAF1 binding, no SAMHD1 degradation | n.a. | PMID 22776683
H39 (Zn finger) | n.a. | H39A/no DCAF1 binding, reduced SAMHD1 degradation, reduced HIV-1 infectivity | n.a. | PMID 18829761; PMID 22776683
Q76 | Q65 | Q76R/no DCAF1 binding, reduced HIV-2 infectivity; Q76A/no DCAF1 binding, reduced HIV-1 infectivity, reduced SIV infectivity, SIV RT activity compromised | Q65R/no DCAF1 binding, no cell cycle arrest | PMID 21720370; PMID 19264781; PMID 19923175; PMID 18464893; PMID 17314515
K77 | n.a. | K77A/no DCAF1 binding, reduced HIV-2 infectivity | n.a. | PMID 19264781
F80 | n.a. | F80A/no DCAF1 binding, reduced SIV infectivity, RT activity compromised | n.a. | PMID 18464893
H82 (Zn finger) | H71 | n.a. | H71R/no DDB1-DCAF1 binding, no cell cycle arrest | PMID 17609381

The increasing demands placed on natural resources for fuel and
food production require that we explore the use of efficient,
sustainable feedstocks such as brown macroalgae. The full potential of
brown macroalgae as feedstocks for commercial-scale fuel ethanol
production, however, requires extensive re-engineering of the
alginate and mannitol catabolic pathways in the standard industrial
microbe Saccharomyces cerevisiae. Here we present the discovery
of an alginate monomer (4-deoxy-L-erythro-5-hexoseulose uronate,
or DEHU) transporter from the alginolytic eukaryote Asteromyces
cruciatus. The genomic integration and overexpression of the gene
encoding this transporter, together with the necessary bacterial
alginate and deregulated native mannitol catabolism genes, conferred
the ability of an S. cerevisiae strain to efficiently metabolize DEHU
and mannitol. When this platform was further adapted to grow on
mannitol and DEHU under anaerobic conditions, it was capable of
ethanol fermentation from mannitol and DEHU, achieving titres of
4.6% (v/v) (36.2 g l−1) and yields up to 83% of the maximum
theoretical yield from consumed sugars. These results show that all major
sugars in brown macroalgae can be used as feedstocks for biofuels
and value-added renewable chemicals in a manner that is
comparable to traditional arable-land-based feedstocks.
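
The paired titre values quoted above (% v/v and g per litre) are related through the density of ethanol. The sketch below shows the conversion, assuming a density of about 0.789 g ml−1 near room temperature; that density value and the function names are assumptions introduced for illustration, not figures from the paper.

```python
# Approximate density of pure ethanol near 20 °C (assumed value), in g per ml
ETHANOL_DENSITY_G_PER_ML = 0.789


def g_per_l_to_percent_vv(titre_g_per_l: float) -> float:
    """Convert an ethanol titre in g per litre to % (v/v)."""
    ml_ethanol_per_litre = titre_g_per_l / ETHANOL_DENSITY_G_PER_ML
    return ml_ethanol_per_litre / 10.0  # ml per litre -> ml per 100 ml


def percent_vv_to_g_per_l(titre_percent_vv: float) -> float:
    """Convert an ethanol titre in % (v/v) to g per litre."""
    return titre_percent_vv * 10.0 * ETHANOL_DENSITY_G_PER_ML


if __name__ == "__main__":
    for grams in (26.2, 36.2):
        print(f"{grams} g/l is about {g_per_l_to_percent_vv(grams):.1f}% (v/v)")
```

Under this assumption the two reported mass concentrations round to the same volume percentages given in the text.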

The United Nations predicts that the world population will grow to 
9.6 billion people by 2050 (ref. 5). According to the World Energy 
Outlook 2012 (ref. 6), the global demand for renewable energy produc- 
tion is anticipated to increase markedly; ethanol production is projected 
to increase 3.4 times by 2035. In 2010, approximately 40% of the US
corn and 55% of the Brazilian sugarcane collected were already used to
produce a majority (86%) of the world's total ethanol. Meanwhile, the
Food and Agriculture Organization projects that overall food produc- 
tion must increase by 70% between 2005 and 2050 (ref. 10). Because the 
arable land space is projected to increase by less than 5% by 2050, over 
90% of the increase in crop production for both food and energy must be 
accomplished through yield improvements and increased farming
intensity, causing significant stress on water resources and fertiliser use.
Thus, more efficient and sustainable sources of biomass will be critical. 

Brown macroalgae exhibit several features of an ideal feedstock that 
can complement the increased global demand on energy and food pro- 
duction. The cultivation of this biomass does not require arable land, 
fresh water or fertiliser, circumventing adverse impacts on food supplies 
and resource availability. Large-scale cultivation of brown macroalgae 
is already being practiced in several countries, yielding over 70 million 
metric tons per year in 2006 (ref. 3). Because brown macroalgae do not 
contain lignin, simple biorefinery processes such as milling, leaching 
and extraction can separate the sugars for conversion into biofuels and
renewable chemicals. Additionally, valuable materials, such as protein
meal for animal feed and potash fertiliser for crop production, can be
separated to support sustainable food production (see Supplementary
Discussion for a brown macroalgae biorefinery description).

The most abundant sugars in brown macroalgae are alginate, man- 
nitol and glucan (present as laminarin and cellulose). Conventional 
industrial microbes can use mannitol and hydrolysed glucan. However,
the full potential of biofuel and renewable chemical production from
brown macroalgae cannot be realized unless alginate is co-fermented.
Alginate composes 30-60% of the total sugars in brown macroalgae,
so the inability of industrial microbes to catabolise alginate results in a
substantial loss of product yield. Additionally, the excess reducing
equivalents produced by ethanol fermentation from mannitol can be
redox-balanced by alginate catabolism, which would otherwise require electron
shunts such as oxygen (Supplementary Fig. 1). Thus, enabling the
co-fermentation of alginate and mannitol in an existing industrial microbe
is a key criterion for the economic and efficient use of the sugars derived
from brown macroalgae.

Alginate is a linear block copolymer of two uronates, β-D-mannuronate
and α-L-guluronate, linked via a 1,4-glycosidic bond and arranged in
varying sequences. We previously characterized alginate metabolism
in the marine bacterium Vibrio splendidus 12B01 (ref. 1). Reconstruction
of this pathway in Escherichia coli enabled the co-fermentation of
alginate, mannitol and glucan into ethanol at a yield of 0.281 g ethanol
per g dry brown macroalgae. Although this E. coli system provides a
compelling proof-of-principle example that may be suitable for the
production of higher value renewable chemicals such as 1,3-propanediol
and 1,4-butanediol, Saccharomyces cerevisiae is a more amenable host
for commercial-scale fuel ethanol production and is the standard microbe
in the bioethanol industry. Enabling ethanol production from the
brown macroalgae sugars through an S. cerevisiae platform requires
the engineering of both the mannitol and alginate catabolic pathways
(Supplementary Fig. 1). Although S. cerevisiae has a native mannitol
metabolic pathway, control of its expression is cryptic and requires
deregulation. Genes encoding the alginate metabolism pathway in
bacteria also need to be expressed in S. cerevisiae. A membrane
transport system must be identified and isolated from an alginolytic
eukaryote such as A. cruciatus. Finally, the efficiencies of the mannitol and
alginate metabolism pathways must be synchronised for redox control.

Our preliminary growth analyses and characterization of alginate/ 
oligoalginate degradation profiles (Supplementary Figs 2 and 3) sug- 
gest that A. cruciatus is capable of DEHU uptake. Transporter activity 
can be efficiently evaluated by a strain that grows on the substrate of 
interest, so we constructed two S. cerevisiae screening strains with the 
ability to digest either DEHU (BAL2193) or oligoalginate (BAL2438; we
did not preclude the possibility of identifying an oligoalginate transporter) 
by introducing the genes responsible for bacterial DEHU or oligoalgi- 
nate catabolism, respectively (Supplementary Fig. 1 and Supplementary 
Table 1). However, the absence of a suitable positive control makes the 
screening process more complex. Negative results do not necessarily 
mean that the tested transporters are inactive; instead, the engineered 
pathways for DEHU and oligoalginate metabolism in the screening 
strains might not possess sufficient activity to support growth. To minimise
this issue, we set as a criterion during strain construction that each
enzyme must possess more than 0.1 µmol min−1 mg−1 specific activity in
crude lysate. This activity is comparable to one of the slower steps of the
glycolytic pathway in S. cerevisiae. Each enzyme present in these screening
strains was selected via in vitro testing (Supplementary Tables 3-7).
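
The activity criterion described above amounts to a simple threshold on specific activity. The following is a minimal sketch, assuming the threshold is expressed in µmol of product per minute per mg of total protein; the function names and the example numbers are hypothetical.

```python
def specific_activity(product_umol: float, minutes: float, protein_mg: float) -> float:
    """Specific activity in µmol min^-1 mg^-1 from a crude-lysate assay."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return product_umol / (minutes * protein_mg)


def meets_criterion(activity: float, threshold: float = 0.1) -> bool:
    """True if the activity exceeds the (assumed) 0.1 µmol min^-1 mg^-1 cut-off."""
    return activity > threshold


if __name__ == "__main__":
    # Hypothetical assay: 1.5 µmol product formed in 5 min by 2 mg lysate protein
    activity = specific_activity(1.5, 5.0, 2.0)
    print(activity, meets_criterion(activity))
```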

To identify an alginate transporter, an RNA-sequencing (RNA-seq)- 
based differential expression analysis of A. cruciatus grown on alginate 
versus glucose was performed using two assembly and expression analysis
programs, Trinity and Cufflinks. The top 20 genes identified by
each program are listed in Supplementary Tables 8 and 9, respectively. 
These results are similar and show that several transporters are highly 
induced when A. cruciatus is grown on alginate. Among them, the trans- 
cript comp8660 (XLOC_004822), encoding an MFS quinate transporter 
homologue, is the most highly expressed transcript. The list also con- 
tains transcripts that might be involved in uronate-containing poly- 
saccharide metabolism, such as homologues of polysaccharide lyases 
(PL), alcohol dehydrogenase (ADH), dihydroxyacetone kinase (DAK) 
and dihydrodipicolinate synthase (aldolase). Genomic DNA sequen- 
cing revealed that some of these genes were clustered in close physical 
proximity to each other (Supplementary Fig. 4), including comp8660, 
implying that comp8660 plays an important role in alginate metabo- 
lism. A complementary DNA (cDNA) on the transcript comp8660 was 
cloned into a yeast expression plasmid (pDHT1-1) that was trans- 
formed into BAL2193 and BAL2438, and the transformed strains were 
tested for growth on DEHU and oligoalginate substrates, respectively. 
BAL2193 harbouring the plasmid showed improved growth in screen- 
ing (S) media (Supplementary Table 10) containing DEHU compared 
to BAL2193 harbouring an empty vector (Fig. 1), whereas BAL2438 
harbouring the same plasmid did not grow on oligoalginate. Thus, the 
cDNA derived from comp8660 encodes a DEHU transporter, named 
Ac_DHT1 (Supplementary Sequence). 

In a complementary approach, we created a constitutively expressed
cDNA library isolated from A. cruciatus that was actively growing on 
alginate. The cDNA library was then transformed into both BAL2193 
and BAL2438 and tested for its ability to impart growth on DEHU and 
oligoalginate, respectively. The population was screened in triplicate 
by serially subculturing the strains in shake flasks over a span of 28 days 
to allow for the enrichment of actively growing strains. BAL2193 har- 
bouring the cDNA library showed significant outgrowth after 13 days 
following the third round of subculture (Supplementary Fig. 5a). Additional
rounds of subculture led to further increases in the final D600 nm
values at much shorter time intervals (48-72 h). By contrast, we observed 
no outgrowth in BAL2438 harbouring the cDNA library, even after 
77 days and 5 rounds of subculture (Supplementary Fig. 5b). To deter- 
mine the identities of the cDNA inserts, we isolated single colonies by 
plating BAL2193 cultures at the end of the third round of subculture. 
In total, 46 out of 63 tested colonies (21 from each flask) had three- to 
sixfold higher final D600 nm values than BAL2193 harbouring an empty
vector (Supplementary Fig. 6). Sequencing analysis revealed that all 46 
strains contained plasmids (pDHT1-2) with the Ac_DHT1 sequence. 
The colony with the best growth characteristics was re-grown in media 
containing DEHU (Fig. 1). When any of the genes in the DEHU meta- 
bolism pathway was removed from the host strain, the strain could not 
grow in media containing DEHU as the sole carbon source (Supplemen- 
tary Fig. 7), indicating that growth in DEHU is pathway dependent. 
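
The colony screen described above compares each isolate's final D600 nm with that of the empty-vector control. A minimal sketch of that comparison follows; the colony names, density readings and the threefold cut-off used as a default are illustrative assumptions, not data from the study.

```python
from typing import Dict


def fold_over_control(final_density: float, control_density: float) -> float:
    """Ratio of a colony's final optical density to the empty-vector control."""
    if control_density <= 0:
        raise ValueError("control_density must be positive")
    return final_density / control_density


def select_hits(
    colony_densities: Dict[str, float],
    control_density: float,
    min_fold: float = 3.0,
) -> Dict[str, float]:
    """Return colonies whose fold increase over the control is at least min_fold."""
    folds = {
        name: fold_over_control(density, control_density)
        for name, density in colony_densities.items()
    }
    return {name: fold for name, fold in folds.items() if fold >= min_fold}


if __name__ == "__main__":
    # Hypothetical readings for three colonies and an empty-vector control
    readings = {"colony_1": 1.0, "colony_2": 0.5, "colony_3": 1.5}
    print(select_hits(readings, control_density=0.25))
```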

Ac_DHT1 shows a high degree of homology to quinate transporters,
which is unexpected because no clear correlation between quinate and 

Figure 1 | Functional validation of a DEHU transporter gene (Ac_DHT1)
isolated via complementary strategies. Ac_DHT1 was isolated through two
complementary strategies: direct cloning following an RNA-seq-based
differential expression analysis and functional screening of a cDNA library
derived from A. cruciatus. Growth validation for the isolated Ac_DHT1 gene
is shown. Blue, BAL2193 with an empty vector plasmid; orange, BAL2193
with pDHT1-1 (cDNA identified through the RNA-seq-based analysis,
gene expression controlled by pTDH3); green, BAL2193 with pDHT1-2
(isolated through the cDNA library screen); and red, BAL2193 re-transformed
with isolated pDHT1-2. The solid lines represent growth on media
containing DEHU, and the dotted lines represent growth on media with no
substrate. D600 nm measurements were taken in a microplate reader. All data and
error bars represent averages and standard deviations of 6-8 measurements.


DEHU can be inferred from their structures. To better understand their 
evolutionary correlation, we built a phylogenetic tree comprising the 
following set (303 protein sequences): Ac_DHT1, Aspergillus nidulans 
quinate transporter (QutD), and their homologues along with the
second most highly expressed MFS transporter in A. cruciatus (comp8944) 
(Fig. 2a). All 100 sequences obtained from a BLASTP search using 
Aspergillus nidulans QutD were classified into a single family (QutD 
family). DHT1 homologues are more highly diverged and were clas- 
sified into three major groups (Fig. 2a). To address the question of why 
quinate and DEHU transporters are evolutionarily related, we ana- 
lysed the structure of DEHU in solution using nuclear magnetic resonance
(NMR) to determine whether it possesses the structure indicated
previously using a chemical method and routinely cited in other studies.
The 1H-NMR, gradient correlation spectroscopy (gCOSY) and 13C-NMR
analyses revealed that DEHU molecules are predominantly
hydrated to form two cyclic hemiacetal stereoisomers, a behaviour that
is similar to that of glutaraldehyde and another unsaturated uronic
acid (Supplementary Figs 8-10 and Supplementary Tables 11 and 12).
These DEHU derivatives are structurally similar to quinate (Fig. 2b), 
which could explain why QutD and DHT1 are evolutionarily related. 

The functional regulation of mannitol catabolism in S. cerevisiae is
cryptic. However, it is understood that the expression of YEL070W/
YNR073C (an NAD+-dependent mannitol-2-dehydrogenase (M2DH))
along with a mannitol transporter is necessary to support S. cerevisiae
growth on mannitol. To address mannitol catabolism in the strains
used in the present study, the gene expression levels of three strain
backgrounds (Lalvin, Pasteur Red and SEY/Dip) that grew on mannitol
(Supplementary Fig. 11) were analysed by microarray in three
different growth substrates: glucose, raffinose, and mannitol. The
analysis revealed that the top three genes induced in all three strains
encode M2DH (YEL070W/YNR073C), previously uncharacterized
putative MFS transporters (YEL069C(HXT13)/YNR072W(HXT17)),
and an aldose-1-epimerase homologue (YNR071C). These genes were

Figure 2 | Evolutionary correlation among quinate (QutD), Ac_DHT1, and
homologous transporters. a, A phylogenetic tree of QutD, Ac_DHT1, and
homologous transporters. The tree indicates a strong evolutionary
relationship between QutD and DHT1. Whereas QutD and its homologues are
tightly clustered, DHT1 is diverged and clustered into three major groups
(Class I, II, and III DHT1 homologue (DTH) families). A. cruciatus DHT1 is
found in class I. A. niger, Aspergillus niger; A. oryzae, Aspergillus oryzae;
N. crassa, Neurospora crassa; P. teres, Pyrenophora teres. b, A putative model
showing how DTH families could be involved in uronic acid metabolism. The
1H-NMR, gCOSY, and 13C-NMR studies indicate that DEHU
can be readily hydrated to form a cyclic hemiacetal structure in an aqueous
solution, similar to the structure of quinate. Although further biochemical
characterization of these transporters is required to validate the model, DTH
families may be strongly diverged to allow the uptake of different stereoisomers
of quinate-like hydrated, cyclic, hemiacetal forms of unsaturated uronates.

Figure 3 | Engineering mannitol metabolism in S. cerevisiae. a, Preliminary
growth analysis of the S. cerevisiae strain BAL2970 harbouring the plasmids
pM-EV (empty vector plasmid for pM1, pM2, and pM3), pM1 (YNR073C,
YNR072W, and YNR071C present; only YNR073C overexpressed), pM2
(YNR073C and YNR072W present and overexpressed), pM3 (YNR073C,
YNR072W, and YNR071C overexpressed), pHXT-EV (an empty vector for
pHXT13 and pHXT17), pHXT13 (YEL069C) or pHXT17 (YNR072W).
b-e, Ethanol production, mannitol consumption, and D600 nm for
BAL2970 harbouring pM1 and pHXT13 (b), pM1 and pHXT17 (c), pM2 and
pHXT-EV (d), and pM3 and pHXT-EV (e), are shown. Overexpression
of YNR072W resulted in better growth compared to overexpression of
YEL069C (b and c), and overexpression of YNR071C slightly improved growth
compared to a construct lacking this gene (d and e). Red, mannitol concentration;
blue, ethanol concentration; grey, D600 nm. All data and error bars represent
averages and standard deviations of triplicate measurements.
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Figure 4 | Ethanol fermentation from mannitol and DEHU. Ethanol was 
fermented from a 1:2 molar ratio of DEHU:mannitol at 6.5% (w/v) (a) and 9.8% 
(w/v) (b) of total sugars using BAL3215. a, An ethanol titre of 3.3% (v/v) 
(26.2 gl"), maximum theoretical yield of 83% from consumed sugars 

(78% from fed sugars), and maximum ethanol productivity of 1.0g1>*h~? 
were achieved. The ratio of mannitol: DEHU consumption was 2.4. b, An 
ethanol titre of 4.6% (v/v) (36.2 g 1”), maximum theoretical yield of 75% from 


combinatorially overexpressed in BAL2970 (Fig. 3a and Supplementary 
Table 2), which was then assessed for ethanol production, mannitol 
consumption and growth (Fig. 3b-e). As expected, the minimal genes 
that were required for S. cerevisiae growth on mannitol encoded M2DH 
and a mannitol transporter (see Supplementary Discussion). 

To enable the co-metabolism of mannitol and alginate, all genes enco- 
ding the enzymes necessary for alginate and mannitol metabolism were 
chromosomally integrated, and the effects of the cofactor preferences 
of different DehR enzymes were evaluated using ethanol production as 
a guideline. Because S. cerevisiae M2DH generates excess NADH, it is 
anticipated that only an NADH-dependent DehR can counterbalance 
the excess reducing equivalents produced from mannitol consump- 
tion. Therefore, we chromosomally integrated genes encoding DehR, 
which preferentially use NADH and co-use NADH and NADPH (Sup- 
plementary Fig. 12), into the S. cerevisiae genome to construct BAL2759 
and BAL2956, respectively (BAL2772, harbouring an NADPH-dependent 
DehR, was used as a control). Initial DEHU-dependent growth rates 
were poor in these strains, so we performed an adaptation experiment 
to improve the efficiency of DEHU catabolism. The strains were grown 
aerobically in shake flasks in media with DEHU as the only carbon 
source. The cultures were diluted every few days to maintain exponen- 
tial growth over the course of three months. The strain doubling time 
decreased from more than 60h to less than 5h over the course of the 
experiment (Supplementary Fig. 13). A preliminary experiment suggested 
that BAL2956 showed an increased ability to produce ethanol from 
mannitol and DEHU (Supplementary Fig. 14). These results indicate 
that the ability of DehR to co-use NADH and NADPH efficiently is 
crucial (for example, the reservation of some NADH is necessary for 
basal cell growth and metabolic maintenance). 
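
The doubling times quoted for the adaptation experiment can be related to specific growth rates with the standard relation µ = ln 2 / t_d. The following is a minimal sketch that assumes simple exponential growth; the function names are illustrative and are not taken from the study.

```python
import math


def specific_growth_rate(doubling_time_h: float) -> float:
    """Return the specific growth rate (per hour) for a doubling time in hours."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


def doubling_time(growth_rate_per_h: float) -> float:
    """Return the doubling time (hours) for a specific growth rate per hour."""
    if growth_rate_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / growth_rate_per_h


if __name__ == "__main__":
    # Doubling times before and after adaptation, as quoted in the text.
    for td in (60.0, 5.0):
        print(f"t_d = {td:5.1f} h  ->  mu = {specific_growth_rate(td):.4f} per h")
```

Under this assumption, a doubling time of 60 h corresponds to roughly 0.012 per hour and a doubling time of 5 h to roughly 0.14 per hour.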

The adapted BAL2956 strain had limited capability for anaerobic 
ethanol fermentation from DEHU and mannitol, requiring microae- 
robicity for efficient ethanol production. Therefore, we adapted and 
selected the strain for growth on DEHU and mannitol under anaerobic 
conditions. Because we were unable to observe anaerobic growth of 
BAL2956 in liquid culture, these experiments were carried out on an agar 
plate with HC media containing DEHU and mannitol. Approximately 
200 colonies appeared on the plate within 5 days after an actively grow- 
ing aerobic culture of BAL2956 was plated. Several colonies were re- 
streaked on fresh plates with the same media conditions. Single colonies 
were isolated and grown aerobically in HC media containing DEHU, and 
the culture (BAL3215) with the fastest growth rate (µ > 0.12 h⁻¹) was 
chosen for further experiments. Ethanol fermentation experiments were 
performed in small-scale laboratory vessels with HC media containing 

consumed sugars (71% from fed sugars), and maximum ethanol productivity of 
1.9 g l⁻¹ h⁻¹ were achieved. The ratio of mannitol:DEHU consumption was 
2.1. Red, mannitol concentration; green, DEHU concentration; blue, ethanol 
concentration; orange, glycerol concentration; black, acetate concentration; 
grey, dry cell weight (DCW). All data and error bars represent the averages and 
standard deviations of triplicate measurements. 


an approximately 1:2 molar ratio of DEHU:mannitol at 6.5% (w/v) 
and 9.8% (w/v) total sugars (mimicking the sugars prepared via the 
brown macroalgae biorefinery approach, Supplementary Fig. 15). In 
both cases, ethanol was efficiently produced, achieving titres of 4.6% 
(v/v) (36.2 g l⁻¹), comparable to the benchmark titres for commercial 
cellulosic ethanol industries, and yields up to 83% of the maximum 
theoretical yield from consumed sugars (Fig. 4). 
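
The paired titre figures (per cent by volume and grams per litre) can be cross-checked with a simple density conversion. The sketch below assumes a density for pure ethanol of 0.789 g per ml (a textbook value near 20 °C that is not stated in the text) and ignores volume contraction on mixing; the function name is illustrative.

```python
# Assumed density of pure ethanol near 20 degrees C, in g per ml (not from the text).
ETHANOL_DENSITY_G_PER_ML = 0.789


def titre_g_per_l_to_percent_vv(titre_g_per_l: float,
                                density_g_per_ml: float = ETHANOL_DENSITY_G_PER_ML) -> float:
    """Convert an ethanol titre in g per litre to per cent (v/v)."""
    if titre_g_per_l < 0 or density_g_per_ml <= 0:
        raise ValueError("titre must be non-negative and density positive")
    ml_ethanol_per_litre = titre_g_per_l / density_g_per_ml
    # 1 litre = 1000 ml, so ml per litre divided by 10 gives per cent (v/v).
    return ml_ethanol_per_litre / 10.0


if __name__ == "__main__":
    for titre in (26.2, 36.2):
        print(f"{titre} g/l is about {titre_g_per_l_to_percent_vv(titre):.1f}% (v/v)")
```

With the assumed density, the two mass titres reported for the fermentations convert to the volume percentages quoted alongside them.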

In conclusion, we have successfully developed the first S. cerevisiae 
synthetic biology platform that enables the use of unique sugars in brown 
macroalgae for high-efficiency ethanol fermentation. The results of the 
present study are of great significance because S. cerevisiae is the standard 
microbe for industrial-scale fuel alcohol production. The use of 
this platform, however, is not limited to ethanol production; it can be 
used to produce many other biofuels and renewable chemicals with 
further genetic modifications. Together with other synthetic biology 
platforms, we foresee that this S. cerevisiae platform technology could 
help to create a sustainable economy and society. 


METHODS SUMMARY 


Enabling S. cerevisiae to ferment brown macroalgae sugars into ethanol required 
four major engineering modifications: (1) reconstruction of a bacterial alginate 
pathway, (2) identification and integration of a DEHU transporter, (3) deregula- 
tion of a native mannitol catabolic pathway, and (4) selection of a DehR with optimal 
cofactor preference for redox-balance. (1) Several genes encoding enzymes cata- 
lysing each step involved in DEHU catabolism were overexpressed in S. cerevisiae. 
The specific activity of each enzyme in a crude lysate was assayed. The genes encoding 
enzymes with high specific activities (>0.1 µmol min⁻¹ mg⁻¹) were chromosomally 
integrated into the DEHU transporter screening strain (BAL2193) 
and ethanol production strains (BAL2772, BAL2759, BAL2956 and BAL3215). 
(2) To isolate a gene encoding the DEHU transporter, two complementary methods 
were used: mRNA-seq-based analysis and a cDNA library based on mRNA isolated 
from A. cruciatus actively growing on alginate. The mRNA-seq-based analysis was 
carried out using Trinity and Cufflinks. The transporter activity was assessed 
by determining whether overexpression of cDNAs derived from both methodo- 
logies enabled BAL2193 to grow on DEHU as the sole sugar substrate. (3) Differen- 
tial gene expression analyses of three S. cerevisiae strains grown in glucose, raffinose 
and mannitol were performed using microarrays. The S. cerevisiae native M2DHs 
and putative mannitol transporters were overexpressed, and the gene combination 
supporting optimum growth of S. cerevisiae on mannitol was identified. (4) The 
genes involved in alginate and mannitol catabolism pathways were chromosomally 
integrated into a single S. cerevisiae host strain. The cofactor preference of DehR 
was investigated based on the ability to support ethanol production from brown 
macroalgae sugars (BAL2772, BAL2759, and BAL2956). Sequential passage 
of BAL2956 in DEHU media under aerobic conditions and in mannitol and 
DEHU media under anaerobic conditions allowed the strain (BAL3215) to adapt 
to enable efficient ethanol fermentation from brown macroalgae sugars. 

METHODS 


All reagents were purchased from Sigma-Aldrich unless stated otherwise. Asteromyces 
cruciatus ATCC 26324 was purchased from ATCC. S. cerevisiae strains were 
derived from the laboratory strain SEY6210/6211 (ATCC). Standard molecular 
biology techniques were used to engineer all S. cerevisiae strains. The genotypes of 
all S. cerevisiae screening and fermentation strains and the compositions of all 
plasmids that were used in this manuscript are listed in Supplementary Tables 1 
and 2, respectively. 

Characterization of alginate degradation by A. cruciatus (in culture supernatant). 
A strain of A. cruciatus was inoculated into 50 mM phosphate buffer at neutral pH 
supplemented with 10 mM nitrate, 2% sodium alginate, 5 µg ml⁻¹ ampicillin, and 
20 µg ml⁻¹ kanamycin at a final volume of 200 ml in a shake flask. The culture was 
grown for 3 days in an orbital shaker at 200 r.p.m. at 25 °C. The culture supernatant 
was collected, filter-sterilized, and mixed with fresh 4% alginate solution. Ten days 
after the reaction was initiated, samples were filtered for high-performance liquid 
chromatography-mass spectrometry (HPLC-MS) analysis. The analysis was carried 
out using an HPLC Nexera system (Shimadzu) equipped with a Hypercarb 
(100 × 3 mm) column (Thermo Fisher Scientific) with a flow rate of 0.65 ml min⁻¹ 
and a temperature of 65 °C. A gradient of 0.2% trifluoroacetic acid in water against 
methanol from 30% to 90% over 15 min was programmed. Oligoalginate peaks 
were detected using an MS (Shimadzu) with selected ion monitoring detector. 
Characterization of oligoalginate degradation by A. cruciatus (growth- 
coupled degradation). A strain of A. cruciatus was inoculated into YNB media 
supplemented with amino acids, 2% sodium alginate degraded by alginate lyase, 
25 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ kanamycin at a final volume of 25 ml in a 
shake flask. The culture was grown for 7 days in an orbital shaker at 200 r.p.m. at 
25 °C. The culture supernatant was filtered for HPLC-ultraviolet analysis. The 
analysis was carried out using an HPLC Nexera system (Shimadzu) equipped with 
a TSKgel Amide-80 (2 mm × 15 cm) 5-µm column (Tosoh) with a flow rate 
of 0.5 ml min⁻¹ and a temperature of 50 °C. A gradient of 25 mM ammonium 
formate in water against acetonitrile from 95% to 70% over 5 min, from 70% to 
50% over 7 min, and 50% held constant for 2 min was programmed. Alginate 
oligomers were detected using an ultraviolet detector (Shimadzu) at 235 nm. 
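
The three-segment gradient described above (95% to 70% over 5 min, 70% to 50% over 7 min, then 50% held for 2 min) can be written as a piecewise-linear programme. The sketch below is only an illustration of that timetable: it assumes linear ramps between the stated breakpoints and does not say which solvent the percentage refers to, since the text leaves that implicit. All names are hypothetical.

```python
from bisect import bisect_right

# (time in min, programmed percentage) breakpoints taken from the described gradient.
BREAKPOINTS = [(0.0, 95.0), (5.0, 70.0), (12.0, 50.0), (14.0, 50.0)]


def gradient_percent(t_min: float, breakpoints=BREAKPOINTS) -> float:
    """Return the programmed percentage at time t_min by linear interpolation."""
    times = [t for t, _ in breakpoints]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the programmed gradient")
    i = bisect_right(times, t_min) - 1
    if i == len(breakpoints) - 1:
        # Exactly at the final breakpoint.
        return breakpoints[-1][1]
    t0, p0 = breakpoints[i]
    t1, p1 = breakpoints[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.0, 2.5, 5.0, 8.5, 12.0, 14.0):
        print(f"t = {t:4.1f} min -> {gradient_percent(t):.1f}%")
```

Representing the method as a list of breakpoints makes it easy to check the total run time (14 min here) against the instrument programme.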
Construction of S. cerevisiae tester strains to test DEHU metabolism enzyme 
activities. Genes encoding DEHU reductase (DehR), 2-keto-3-deoxy-D-gluconate 
(KDG) kinase (KdgK), KDG-6-phosphate (KDGP) aldolase (KdgpA) and oligoal- 
ginate lyase from several organisms were codon optimised and/or synthesized 
(DNA2.0 Inc.). The synthesized genes were cloned into plasmids downstream of 
strong, constitutive S. cerevisiae promoters (pTEF2 or pFBA1). DNA constructs con- 
taining these sequences for genomic integration were generated by high-fidelity 
PCR amplification using Phusion DNA polymerase (New England Biolabs) and 
integrated into the genome of the wild-type laboratory strain BAL2130 using a 
standard polyethylene glycol, lithium acetate, Tris-EDTA (PEG/LiAC/TE) yeast 
transformation protocol. Genomic integrations were confirmed by diagnostic 
PCR. The activity of each introduced enzyme in a BAL2130 crude lysate was mea- 
sured with the appropriate enzyme activity assay described below. The enzyme 
activity results are listed in Supplementary Tables 3-7. 

Assay for oligoalginate lyase activity. A spectrophotometric coupled enzyme assay 
for oligoalginate lyase was developed. The reaction mixture contained 50 mM HEPES 
(pH 7.4), 100 mM NaCl, 50 mM KCl, 5 mM MgCl2, 0.005% (w/v) Triton X-100, 
0.5% (w/v) oligoalginate, 4 mM NADPH, and purified DehR from Agrobacterium 
tumefaciens C58 in a final volume of 50 µl. Oligoalginate was prepared by incubating 
0.10 mg ml⁻¹ alginate lyase with 50 g l⁻¹ sodium alginate in the presence of 
2 mM dithiothreitol (DTT) for over 10 h at 37 °C, and this preparation was used 
directly in the assay. Excess partially purified DehR (sufficient to give a rate of ≥0.2 
A340 nm min⁻¹ with 10 mM DEHU and 2 mM NADPH) was present along with 
2 mM NADPH. The reaction was initiated by adding 5 µl of S. cerevisiae crude lysate. 
Assay for DehR activity. A spectrophotometric assay to identify DehR activity in 
S. cerevisiae lysates was developed. The assay directly measures the oxidation of 
NADPH when DEHU is reduced to KDG. The reaction mixture contained 50 mM 
HEPES (pH 7.4), 100 mM NaCl, 50 mM KCl, 5 mM MgCl2, 0.005% (w/v) Triton 
X-100, 10 mM DEHU, and 4 mM NADPH in a final volume of 50 µl. The reaction 
was initiated by adding 5 µl of S. cerevisiae crude lysate. 
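
Spectrophotometric assays of this kind are usually converted to a specific activity through the Beer-Lambert law. The sketch below shows one way to do that for an NADPH-coupled reading at 340 nm. It assumes the commonly used molar absorption coefficient of 6,220 M⁻¹ cm⁻¹ for NAD(P)H; the optical path length, the lysate protein concentration and the example slope are hypothetical inputs that the text does not report.

```python
# Commonly used molar absorption coefficient of NAD(P)H at 340 nm (per M per cm).
NADPH_EXTINCTION_PER_M_CM = 6220.0


def specific_activity_umol_min_mg(delta_a340_per_min: float,
                                  path_length_cm: float,
                                  assay_volume_ul: float,
                                  lysate_volume_ul: float,
                                  lysate_protein_mg_per_ml: float) -> float:
    """Return specific activity in micromol NADPH oxidised per min per mg protein."""
    if min(path_length_cm, assay_volume_ul, lysate_volume_ul,
           lysate_protein_mg_per_ml) <= 0:
        raise ValueError("path length, volumes and protein concentration must be positive")
    # Beer-Lambert law: change in NADPH concentration (mol per litre per min).
    molar_rate = abs(delta_a340_per_min) / (NADPH_EXTINCTION_PER_M_CM * path_length_cm)
    # Amount converted in the assay volume, in micromol per min.
    umol_per_min = molar_rate * (assay_volume_ul * 1e-6) * 1e6
    # Protein added with the lysate, in mg (microlitres -> ml -> mg).
    protein_mg = lysate_volume_ul * 1e-3 * lysate_protein_mg_per_ml
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: slope 0.2 per min, 1 cm path, 50 ul assay,
    # 5 ul of lysate at 2 mg protein per ml.
    print(round(specific_activity_umol_min_mg(0.2, 1.0, 50.0, 5.0, 2.0), 4))
```

In a microplate the effective path length depends on the fill volume, so it has to be measured or corrected for before values from different formats are compared.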

Assay for KdgK activity. A spectrophotometric assay to identify KdgK activity in 
S. cerevisiae lysates was developed. KdgK activity is measured by NADH con- 
sumption coupled with KdgpA and lactate dehydrogenase activity. First, phos- 
phorylated KDG is cleaved into pyruvate and glyceraldehyde-3-phosphate by 
KdgpA. Lactate dehydrogenase reduces pyruvate into lactate while simultaneously 
oxidising NADH. The reaction mixture contained 50 mM HEPES (pH 7.4), 
100 mM NaCl, 50 mM KCl, 5 mM MgCl2, 2 µM DTT, 0.005% (w/v) Triton X-100, 
5 mM KDG, 10 mM ATP, 4 mM NADH, 2 µg ml⁻¹ partially purified KdgpA from 
E. coli, and 50 µg ml⁻¹ lactate dehydrogenase in a final volume of 50 µl. In this 


assay, KDG was prepared via the reaction of mannonate lactone (Carbosynth) and 
a purified mannonate dehydratase from E. coli (prepared in-house). 

Assay for KdgpA activity. A spectrophotometric assay to identify KdgpA activity 
in S. cerevisiae lysates was developed. The reaction mixture contained 50 mM 
HEPES (pH 7.4), 100 mM NaCl, 50 mM KCl, 5 mM MgCl2, 2 µM DTT, 0.005% 
(w/v) Triton X-100, 5 mM KDGP, 4 mM NADH, and 50 µg ml⁻¹ lactate dehydrogenase 
in a final volume of 50 µl. KDGP was synthesized by the reaction of 
KDG and ATP with KdgK. 

Construction of DEHU and oligoalginate transporter screening strains. Genes 
encoding oligoalginate lyase from Agrobacterium tumefaciens C58 or Oceanicola 
granulosus HTCC2516, DehR from Sphingomonas sp. A1, KdgK from Saccharophagus 
degradans 2-40, and KdgpA from Vibrio splendidus 12B01 were selected based on 
the results of enzyme activity assays (with the exception of the oligoalginate lyase 
from Agrobacterium tumefaciens C58, which was integrated before enzyme scree- 
ning; Supplementary Tables 3-6). The sequences were cloned downstream of 
constitutive S. cerevisiae promoters (pTEF2, pFBA1 or pTDH3) into plasmids. 
DNA constructs containing these sequences for genomic integration were gene- 
rated by high-fidelity PCR amplification and were integrated into the S. cerevisiae 
genome using a standard PEG/LiAC/TE yeast transformation protocol to create 
the host strains BAL2193 and BAL2438. Genomic integrations were confirmed 
by diagnostic PCR. The genotypes of these strains are listed in Supplementary 
Table 1. The activity of each introduced enzyme in crude lysates was measured 
with the appropriate enzyme activity assay. 

Identification of DEHU transporter (Ac_DHT1) through RNA-seq-based 
analysis. A. cruciatus was grown aerobically at 30°C in YP media with either 4% 
sodium alginate or 2% glucose as the primary carbon source. Cells were collected 
from exponentially growing cultures, and DNA and total RNA were isolated. The 
genomic DNA was sequenced by Eureka Genomics using 51-cycle paired-end 
reads on an Illumina Genome Analyzer II instrument (Illumina). These sequence 
reads were assembled using the Velvet software package to generate a library of 
putative genomic DNA fragments with a minimum scaffold size of 200 bases. The 
RNA-seq analysis was carried out by Prognosys Biosciences using 76-cycle single- 
end reads on an Illumina Genome Analyzer 5 instrument (Illumina). The differential 
expression of mRNAs in A. cruciatus grown on alginate versus glucose 
was analysed using the Trinity and Cufflinks program packages. A cDNA on 
the transcript comp8660 was amplified by PCR using total RNA isolated from 
A. cruciatus as a template. The amplified DNA fragment was cloned into an 
S. cerevisiae CEN/ARS-based expression vector under the constitutive S. cerevisiae 
TDH3 promoter. 

DEHU-dependent growth screening in deep-well plates. DEHU was prepared 
from a 5% sodium alginate solution in water that was treated with 0.025 mg ml⁻¹ 
alginate lyase and 0.2 mg ml⁻¹ oligoalginate lyase (prepared in house). This mixture 
was subjected to high-speed centrifugation followed by filtration through a 
0.22-µm sterile filter (when needed, this DEHU solution can be concentrated 
using a conventional rotary evaporator). To minimise background growth, yeast 
minimal media consisting of YNB supplemented with CSM lacking uracil (Sunrise 
Science Products) was used at a concentration recommended by the manufacturer 
(Screening (S) media in Supplementary Table 10). DEHU was added to each experimental 
well at a concentration of 8.8 g l⁻¹. Growth assays were carried out in 96- 
deep-well plates at 29 °C with 90% humidity and shaking at 950 r.p.m. Growth 
was assessed by measuring the D600 nm of the culture using an Eon Microplate 
Spectrophotometer (BioTek). 

Identification of Ac_DHT1 using an A. cruciatus cDNA library. Total RNA 
was isolated from 1 l cultures of A. cruciatus that were actively growing on alginate 
using the RNeasy Plant Mini Kit (Qiagen). The mRNA was purified with the 
Dynabead mRNA purification kit (Invitrogen) and used as an input for cDNA 
library construction with the Superscript Full-Length cDNA Library Construction 
Kit (Invitrogen). cDNA inserts were cloned into a Gateway cloning-compatible 
yeast expression vector under the control of the S. cerevisiae TDH3 promoter. The 
plasmid contained a blasticidin resistance gene for plasmid selection. 

The cDNA library was introduced into the host strains BAL2193 and BAL2438. 
These strains were grown at 30 °C in S media (Supplementary Table 10) containing 
13.2 g l⁻¹ DEHU or 25 g l⁻¹ oligoalginate, respectively. Cultures were periodically 
set back to a D600 nm of 0.1 (indicated by the dashed lines in Supplementary 
Fig. 5 for BAL2193). After the third round of subculture, BAL2193 showed signi- 
ficant outgrowth, and cells from this culture were plated onto YPD plates supple- 
mented with blasticidin to isolate single colonies. A total of 21 colonies from each 
flask were tested for growth on S media with DEHU and S media with no substrate 
in a deep-well plate format (600 µl) to isolate the strain harbouring Ac_DHT1. The 
positive clones that were identified via secondary screening were then sequenced to 
identify the cDNA insert. 

Construction of strains to study the pathway dependence of DEHU consumption. 
Strains derived from BAL2193 to test pathway dependence during growth in 
DEHU media were generated by sporulating BAL2193 and isolating the haploid 
strain BAL2267. BAL2267 was mated to the wild-type haploid strain BAL2130, 
and the resulting heterozygous diploid was sporulated. Following sporulation, 
haploids that were missing each individual DEHU consumption pathway gene 
were isolated (BAL2295, BAL2296, BAL2297 and BAL2298). The genotypes of all 
pathway dependence strains are listed in Supplementary Table 1. 
Construction of a phylogenetic tree for Ac_DHT1. A BLASTP search against 
the non-redundant protein sequence database was carried out using the protein 
sequences Ac_DHT1, an Ac_DHT1 homologue (a cDNA on the transcript comp2060), 
Aspergillus niger An14g04280, and Aspergillus nidulans QutD as queries. For each 
sequence, the protein sequences for the top 100 BLAST hits were retrieved. These 
protein sequences were pooled (400 sequences). The putative protein sequence for 
an MFS transporter (comp8944 and XLOC_008636), which is highly expressed in 
A. cruciatus grown in media containing alginate versus glucose, was used as an anony- 
mous root. A multiple sequence alignment (MSA) for these sequences was gene- 
rated using MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/). All redundant 
sequences were eliminated from the MSA, and a distance matrix for the remaining 
289 aligned protein sequences was generated based on neighbour-joining algorithms. 
The phylogenetic tree was visualized using Geneious Pro software (Geneious). 
This first tree identified that Ac_DHT1, its homologue from A. cruciatus, and 
An14g04280 from Aspergillus niger are classified into two different subfamilies 
(class I and II DHT1 homologue families) and suggested the presence of an addi- 
tional subfamily of DHT1 homologues that is more strongly related to the QutD 
family. To refine the tree, a new MSA was built based on a pool that included 100 
protein sequences that were identified through a BLASTP search using gi|89198523 
(a representative protein in this family) as a query in addition to the 289 protein 
sequences in the original set. All redundant sequences were eliminated from the 
MSA, and a new tree was constructed using the remaining 303 aligned protein 
sequences. 
Sample preparation of DEHU for NMR analysis. DEHU was prepared from a 
5% sodium alginate solution in water that was treated with 0.025 mg 
ml⁻¹ alginate lyase and 0.2 mg ml⁻¹ oligoalginate lyase. The mixture was then 
subjected to high-speed centrifugation, followed by filtration through a 0.22-µm 
sterile filter. This DEHU solution was concentrated on a rotary evaporator that 
maintained a bath temperature below 40 °C, followed by drying under high vacuum 
conditions (<100 mTorr) for 15-20 h, resulting in a yellow crystalline solid. 
Phosphate-buffered (approx. 0.2 M) NMR samples of DEHU (approx. 0.1 M) 
were prepared by dissolving dry DEHU in D2O with either t-butanol (0.1 M) or 
methanol (0.1 M) as an internal standard. Phosphate was used to buffer the sample 
to pH 7.0. 
Standard ¹H- and ¹³C-NMR spectra. NMR spectra were measured at ambient 
temperature on a Varian UNITYplus 500 spectrometer equipped with a 5 mm pulsed- 
field-gradient (PFG) switchable broadband probe and operating at 499.74 MHz (¹H) 
and 125.67 MHz (¹³C). One-dimensional (1D) ¹H- and ¹³C-NMR spectra were 
acquired under standard conditions and referenced to the solvent residual peak. 
¹H-NMR spectra were obtained using a spectral width of 4,529 Hz over 64,000 
data points and a relaxation delay of 2 s. ¹³C-NMR spectra were obtained using a 
relaxation delay of 3 s and spectral width of 26,595.74 Hz over 32,768 data points 
and multiplied by an exponential function corresponding to a 0.50 Hz line broadening 
before Fourier transformation. 
2D ¹H-NMR spectra. Two-dimensional (2D) ¹H-NMR spectra were obtained 
using gradient pulses on a 5 mm PFG switchable broadband probe without sample 
spinning. Phase-sensitive spectra were acquired using the hyper-complex States 
method. Threefold linear prediction was applied to the F1 dimension as imple- 
mented by standard Varian software. 
gCOSY. Absolute value gCOSY spectra were obtained using a spectral width of 
4,529 Hz in both dimensions. The pre-acquisition delay was set to 1.0s, and 400 
increments with 4 transients of 1,024 data points were acquired. Both F2 and F1 
were multiplied by unshifted sine-bell weighting functions that were matched to 
the acquisition or evolution time. Prior to 2D Fourier transformation, F2 and F1 
were zero filled to 2,048 and 2,048 data points, respectively. 
Construction of an S. cerevisiae host strain capable of mannitol metabolism. 
An S. cerevisiae genomic region containing YNR073C (mannitol-2-dehydrogenase 
(M2DH)), YNR072W (HXT17), and YNR071C (gene encoding a putative aldose- 
1-epimerase) was amplified by high-fidelity PCR and cloned into a yeast CEN/ARS 
plasmid downstream of the S. cerevisiae TDH3 promoter (pTDH3 driving the expression 
of YNR073C only; the other two ORFs retained their native promoter regions). 
This plasmid was named pM1. A second plasmid, containing YNR073C and 
YNR072W under the control of the S. cerevisiae TDH3 and TEF2 promoters, 
respectively, was also generated (pM2). To increase the expression levels of all 
three genes in this genomic region, the DNA sequences composing the S. cerevisiae FBA1 and 
TEF2 promoters were subcloned into pM1 to drive the expression of YNR071C 
and YNR072W, respectively. This plasmid was named pM3. These plasmids, along 
with plasmids containing only YEL069C (pHXT13) or YNR072W (pHXT17) under 
the control of the S. cerevisiae TDH3 promoter, were transformed into the strain 
BAL2970, and the transformed strains were tested for the ability to grow in HC 
media containing 7.5% mannitol as the sole carbon source. 

Construction of S. cerevisiae host strains with integrated mannitol and DEHU 
metabolism pathways. Genes encoding DehR from different sources (Vibrio 
splendidus 12B01, Agrobacterium tumefaciens C58 or Vibrio harveyi), KdgK from 
E. coli, KdgpA from Vibrio splendidus 12B01, and Ac_DHT1 were codon-optimised 
for S. cerevisiae expression (with the exception of Agrobacterium tumefaciens DehR) 
and synthesized (DNA2.0 Inc.). The synthesized sequences were cloned into plas- 
mids downstream of strong, constitutive S. cerevisiae promoters (pFBA1 and 
pTDH3). These sequences, along with the genes and associated promoters from 
pM3, were subcloned into plasmids that contained wild-type biosynthetic genes 
and their associated 5’ and 3’ untranslated regions (UTRs). DNA constructs for 
genomic integration that contained these genes were generated by restriction 
enzyme digestion of plasmids and then integrated into the S. cerevisiae genome 
at the appropriate auxotrophic loci to create the prototrophic host strains BAL2759 
(containing V. splendidus DehR), BAL2772 (containing Agrobacterium tumefa- 
ciens DehR), and BAL2956 (containing V. harveyi DehR). The genotypes of these 
strains are listed in Supplementary Table 1. 

Media and growth conditions. All adaptation and ethanol production experi- 
ments were carried out using a defined synthetic medium that was supplemented 
with a mixture of amino acids as a nitrogen source. In general, experiments in which 
the media contained less than 4% carbon were carried out using a low-carbon (LC) 
medium (Supplementary Table 10). Experiments with media containing greater 
than 4% carbon were conducted using the high carbon (HC) medium (Supplemen- 
tary Table 10). The pH levels of all media were adjusted to 5.5 with 13 N NaOH 
before filter sterilization. The pH levels of all shake flask and screw-cap bottle 
cultures were maintained throughout the experiments by supplementation with 
20-50 mM MES buffer. All fermentation experiments were supplemented with 
Tween 80 (0.42 g l⁻¹) and ergosterol (10 g l⁻¹). 

Adaptation of engineered S. cerevisiae strains under aerobic conditions. 
BAL2759, BAL2772 and BAL2956 strains with high specific growth rates in DEHU 
media were selected by serial transfer in 125 ml unbaffled shake flasks. Cultures 
were inoculated in 25 ml LC media with 8.8 g l⁻¹ DEHU as the sole carbon source. 
Growth was monitored by D600 nm throughout the experiment. The cells were 
diluted into fresh media before the culture density reached 50% of the maximum 
cell mass supported by the medium. During serial transfers, the cultures were 
diluted at least 100-fold, and the process was repeated as needed for a period of 
4-6 months. The DEHU concentration was increased gradually to 17.6 g l⁻¹ in all 
cultures and increased again to 35 g l⁻¹ in the case of BAL2956. All serial transfer 
flasks were grown on a rotary shaker and maintained at 29 °C with constant shak- 
ing and controlled humidity (250 r.p.m. and 90% humidity). 
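
The serial-transfer scheme lends itself to a quick estimate of how many generations accumulate per passage. The sketch below assumes that each culture regrows to roughly its pre-dilution density before the next transfer, so that a D-fold dilution corresponds to log2(D) generations; this is a simplification, and the function names are illustrative.

```python
import math


def generations_per_transfer(dilution_fold: float) -> float:
    """Generations needed to regrow to the pre-dilution density after one transfer."""
    if dilution_fold <= 1:
        raise ValueError("dilution must be greater than 1-fold")
    return math.log2(dilution_fold)


def transfers_needed(target_generations: float, dilution_fold: float) -> int:
    """Smallest number of transfers that accumulates the target number of generations."""
    if target_generations <= 0:
        raise ValueError("target generations must be positive")
    return math.ceil(target_generations / generations_per_transfer(dilution_fold))


if __name__ == "__main__":
    per_transfer = generations_per_transfer(100)
    print(f"100-fold dilution: about {per_transfer:.2f} generations per transfer")
    print(f"Transfers for 150 generations: {transfers_needed(150, 100)}")
```

Because the cultures were diluted before reaching full density and by at least 100-fold, the real number of generations per transfer may differ from this estimate.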

Microaerobic ethanol production from DEHU and mannitol. Yeast cultures 
were grown in HC medium containing 3% DEHU and 6% mannitol to prepare 
inoculums for ethanol production experiments. The inoculums for these starter 
cultures were taken from exponentially growing mid-log cultures from the adapta- 
tion experiments after at least 150 generations of selective pressure for DEHU- 
dependent growth (Supplementary Fig. 13). Cells were acclimated through two 
rounds of culturing in the aforementioned media. The second-round starter cul- 
tures were grown to mid-log phase and collected by centrifugation. These cell 
pellets were used to inoculate the ethanol production experiments. 

Ethanol production was first carried out in 125 ml screw-cap bottles with silicone 
septa, magnetic stir bars, and working volumes of 30 ml each. The cells were 
co-fed with 3.4% DEHU and 4.4% mannitol in HC medium to select strains for 
follow-up experiments. Upon inoculation, cultures were placed in a water bath at 
29 °C on top of a magnetic stir plate set to 500 r.p.m. Samples were taken periodically 
via syringe sampling through the septa for HPLC analysis, DEHU quantification, 
and D600 nm measurements. 
Selection of a BAL2956 strain variant that is capable of growth under anae- 
robic conditions. The adapted BAL2956 strain with a high specific growth rate in 
HC media containing DEHU as the substrate was unable to produce ethanol 
efficiently without maintaining microaerobic conditions. Therefore, cells that were 
capable of growing on agar plates with HC media containing 2% DEHU and 4% 
mannitol under anaerobic conditions were selected. A total of 50 µl of BAL2956 
aerobic culture was plated onto several agar plates. The plates were enclosed in a 
tightly sealed container with AnaeroPack (Mitsubishi Gas Chemical) and incu- 
bated at 30 °C for several days. Within the first 5 days, approximately 200 colonies 
appeared on each plate. To isolate single colonies, several of these colonies were re- 
streaked onto agar plates containing fresh HC media supplemented with 2% DEHU 
and 4% mannitol. These colonies were grown on HC media supplemented with 
3% DEHU aerobically, and the colony that showed the fastest growth rate in this 
medium was chosen for further experiments (BAL3215). 

Ethanol fermentation from DEHU and mannitol. Seed yeast cultures of 
BAL3215 were aerobically grown in HC medium containing either an approximately 1:2 
molar ratio of DEHU:mannitol at 6.5% (w/v) or 2% DEHU to prepare inoculums 
for ethanol production experiments. Ethanol fermentation experiments were carried 
out in 125 ml screw-cap bottles with silicone septa, magnetic stir bars and working 
volumes of 55 ml each (the head space was flushed with nitrogen gas). The cells were 
co-fed with an approximately 1:2 molar ratio of DEHU and mannitol at 6.5% (w/v) 
and 9.8% (w/v) of total sugars. Upon inoculation (the culture grown in DEHU/ 
mannitol media was inoculated into 6.5% (w/v) fermentation media, while the 
culture grown in DEHU media was inoculated into 9.8% (w/v) fermentation media), 
the cultures were placed in a water bath at 29 °C on top of a magnetic stir plate set 
to 350 r.p.m. Samples were taken periodically via syringe sampling through the 
septa for HPLC analysis and dry cell weight quantification. 
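
The feed compositions are specified both as a molar ratio and as a total sugar percentage (w/v), so it can help to convert between the two. The sketch below does this under two assumptions that are not stated in the text: DEHU is treated as the free acid C6H8O6 (about 176.12 g per mol) and mannitol as C6H14O6 (about 182.17 g per mol). The function names are illustrative.

```python
# Assumed molar masses in g per mol (not given in the text).
DEHU_G_PER_MOL = 176.12      # DEHU taken as the free acid C6H8O6
MANNITOL_G_PER_MOL = 182.17  # mannitol, C6H14O6


def mannitol_per_dehu_molar_ratio(dehu_g_per_l: float, mannitol_g_per_l: float) -> float:
    """Return mol of mannitol per mol of DEHU for the given mass concentrations."""
    if dehu_g_per_l <= 0 or mannitol_g_per_l < 0:
        raise ValueError("DEHU must be positive and mannitol non-negative")
    return (mannitol_g_per_l / MANNITOL_G_PER_MOL) / (dehu_g_per_l / DEHU_G_PER_MOL)


def split_total_sugar(total_g_per_l: float, mannitol_per_dehu: float = 2.0):
    """Split a total sugar concentration into (DEHU, mannitol) in g per litre."""
    if total_g_per_l < 0 or mannitol_per_dehu < 0:
        raise ValueError("inputs must be non-negative")
    dehu_share = DEHU_G_PER_MOL / (DEHU_G_PER_MOL + mannitol_per_dehu * MANNITOL_G_PER_MOL)
    dehu = total_g_per_l * dehu_share
    return dehu, total_g_per_l - dehu


if __name__ == "__main__":
    # 3% DEHU plus 6% mannitol (w/v) is 30 and 60 g per litre.
    print(round(mannitol_per_dehu_molar_ratio(30.0, 60.0), 2))
    # 6.5% (w/v) total sugars at a 1:2 DEHU:mannitol molar ratio.
    print(tuple(round(x, 1) for x in split_total_sugar(65.0)))
```

Under these assumptions, a 1:2 mass ratio of DEHU to mannitol is close to, but not exactly, a 1:2 molar ratio, which is consistent with the text's use of "approximately".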

Quantification of mannitol, DEHU, ethanol, glycerol and acetate in culture 
media. The metabolite analyses were carried out using the HPLC Prominence system 
(Shimadzu) equipped with a Rezex ROA organic acid H+ 300 × 7.8 mm Phenomenex 
column and run with a flow rate of 0.6 ml min⁻¹ and a temperature of 60 °C. 
The method is isocratic with the mobile phase of 5 mM sulphuric acid and was run 
over 30 min. Analytes were detected using a refractive index detector (Shimadzu). 
Quantification of DEHU. The DEHU standard curve for HPLC analysis was
generated using At_DehR (prepared in house). The reaction mixture contained
50 mM HEPES at pH 7.4, 100 mM NaCl, 50 mM KCl, 5 mM MgCl2, 0.005% (w/v)
Triton X-100, 4 mM NADPH, and 0.132 µg µl⁻¹ DehR in a 50 µl total assay
volume. DEHU was quantified based on the consumption of NADPH, measured
by monitoring the absorbance at 340 nm using a plate reader.
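
As a rough illustration of the NADPH-based readout, the sketch below converts a drop in absorbance at 340 nm into a DEHU concentration with the Beer-Lambert law. It assumes one NADPH consumed per DEHU reduced, the commonly cited NADPH molar absorptivity of 6,220 M⁻¹ cm⁻¹ at 340 nm, and a hypothetical optical path length for the well. None of these values is stated in the text, and the function name is illustrative only.

```python
# Assumed constant: commonly cited molar absorptivity of NADPH at 340 nm.
NADPH_EPSILON_340 = 6220.0  # M^-1 cm^-1


def dehu_molar_from_a340(a340_start, a340_end, path_cm=0.5, nadph_per_dehu=1.0):
    """Estimate DEHU concentration (mol/l) from the decrease in A340.

    path_cm and nadph_per_dehu are assumptions for illustration; a real
    plate-reader assay would calibrate both against a standard curve.
    """
    delta_a = a340_start - a340_end
    if delta_a < 0:
        raise ValueError("A340 increased; NADPH was not consumed")
    # Beer-Lambert: concentration = absorbance / (epsilon * path length)
    nadph_consumed = delta_a / (NADPH_EPSILON_340 * path_cm)
    return nadph_consumed / nadph_per_dehu


if __name__ == "__main__":
    conc = dehu_molar_from_a340(1.200, 0.578, path_cm=1.0)
    print(f"Estimated DEHU: {conc * 1e6:.1f} uM")
```

In practice the path length in a microplate depends on the fill volume, which is one reason a standard curve is the safer route.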
The determination of protein crystal structures is hampered by the
need for macroscopic crystals. X-ray free-electron lasers (FELs)
provide extremely intense pulses of femtosecond duration, which
allow data collection from nanometre- to micrometre-sized crystals
in a ‘diffraction-before-destruction’ approach. So far, all protein struc-
ture determinations carried out using FELs have been based on pre-
vious knowledge of related, known structures. Here we show that
X-ray FEL data can be used for de novo protein structure determi-
nation, that is, without previous knowledge about the structure. Using
the emerging technique of serial femtosecond crystallography,
we performed single-wavelength anomalous scattering measure-
ments on microcrystals of the well-established model system lyso-
zyme, in complex with a lanthanide compound. Using Monte-Carlo
integration, we obtained high-quality diffraction intensities from
which experimental phases could be determined, resulting in an
experimental electron density map good enough for automated
building of the protein structure. This demonstrates the feasibility of
determining novel protein structures using FELs. We anticipate that
serial femtosecond crystallography will become an important tool
for the structure determination of proteins that are difficult to crys-
tallize, such as membrane proteins.

Detailed knowledge of protein structures provides essential insight 
into their function at the molecular level, helping to elucidate basic 
biological processes and guiding the design of new drugs for medical 
applications. The vast majority of protein structures are determined by 
X-ray crystallography. Since its beginning almost 100 years ago, pro- 
gress in crystallography has closely reflected advances in X-ray sources 
and other instrumentation, allowing high-throughput approaches and 
the analysis of ever smaller crystals. The latter is important because 
many macromolecules, particularly membrane proteins, are difficult 
to crystallize, often yielding only very small crystals. The analysis of 
these, however, is complicated by radiation damage. 

This problem is now being solved through the use of X-ray free- 
electron lasers (FELs). These provide femtosecond, ultrabright X-ray 
pulses which are so brief that useful diffraction data can be collected 
before the sample is destroyed by radiation damage. Since the recent 
initial demonstration of structural-biological studies using X-ray FELs,
these devices have already expanded the possibilities of protein crys-
tallography, as shown most recently by X-ray data collection of the
undamaged electronic structure of the highly radiation-sensitive Mn4CaO5
cluster of photosystem II at room temperature. Although FEL data
collection is fundamentally different from conventional crystallography 
and data processing techniques are still being developed, it is already 
evident that high-resolution data can be collected from micron-sized 
crystals at ambient temperature. For example, FEL data on lysozyme as 
a model system agree well with low-dose synchrotron data despite a 
dose of 30 MGy per crystal, the typical dose limit for data collection at
cryogenic temperature. The FEL analysis of micron-sized crystals of
the trypanosomal protease cathepsin B revealed new high-resolution
structural features that showed how the native protease is inhibited.
These findings, and the demonstration that X-ray diffraction data from
nanometre- to micrometre-sized crystals of large membrane protein
complexes can be collected, illustrate the potential of FEL-based
protein crystallography as a new tool for the analysis of the large group
of proteins that are difficult to crystallize.

Crystallographic structure determination requires the retrieval of infor- 
mation about the phases of the diffracted radiation which is lost during 
the measurement of intensities. So far, the data for all FEL-determined 
protein structures have been phased by molecular replacement using
phases from known, related structures. However, to establish FEL-based
crystallography as a true stand-alone tool for macromolecular struc-
ture determination, it is essential to show that the data can be phased
de novo. FEL-specific phasing approaches have been proposed, such
as oversampling-based methods, as well as a variation on phasing
induced by radiation damage using the extremely high intensity
of FEL pulses to change the anomalous scattering behaviour of spe-
cific atoms. However, there is no apparent reason why conventional
phasing methods cannot be used: these include the use of heavy atoms 
in multiple or single isomorphous replacement approaches, with or 
without multi- or single-wavelength anomalous diffraction (MAD or 
SAD) measurements which exploit element-specific scattering at X-ray 
absorption edges. 

Protein crystallography at X-ray FEL sources typically uses the emer- 
ging method of serial femtosecond crystallography (SFX). Given the 
extremely high intensity pulses generated by FELs, every X-ray expo- 
sure results in the complete destruction of the sample. Hence every 
diffraction image requires a new crystal, and these are introduced into 
the beam in a thin column of liquid. Moreover, during the femto-
second-timescale exposure, no rotation of the crystal can be performed
as in conventional crystallography, so that only still images containing 
partially recorded reflections are obtained. Variations in crystal size 
and quality and the lack of control over the crystal orientations com- 
plicate SFX data processing. Furthermore, the spectrum, intensity and 
beam profile of an FEL source typically vary significantly from shot to 
shot, further complicating the analysis. In SFX this is solved by aver- 
aging over a large number of exposures (Monte-Carlo integration), 
which then yields fully integrated diffraction intensities.
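
The averaging idea can be illustrated with a toy simulation. The sketch below is not the CrystFEL algorithm: it simply models each still image as recording a random fraction of a reflection, scaled by a random shot-to-shot factor, and shows that the mean over many observations converges to a value proportional to the full intensity. The uniform distributions, the function names and the example intensities are all assumptions made for illustration.

```python
import random
import statistics


def simulate_partial_observations(full_intensity, n_obs, rng):
    """Toy model: each observation is the full intensity times a random
    partiality in [0, 1) and a random pulse-intensity factor in [0.5, 1.5]."""
    return [
        full_intensity * rng.random() * rng.uniform(0.5, 1.5)
        for _ in range(n_obs)
    ]


def monte_carlo_merge(observations):
    """Merge by plain averaging over all observations of one reflection."""
    return statistics.fmean(observations)


if __name__ == "__main__":
    rng = random.Random(0)
    strong_true, weak_true = 1000.0, 250.0
    for n_obs in (10, 100, 1000, 20000):
        strong = monte_carlo_merge(simulate_partial_observations(strong_true, n_obs, rng))
        weak = monte_carlo_merge(simulate_partial_observations(weak_true, n_obs, rng))
        # The common scale factor cancels in the ratio of merged intensities.
        print(f"{n_obs:6d} observations: merged ratio = {strong / weak:.2f} (true 4.00)")
```

Only the relative intensities matter for structure determination, which is why the unknown common scale factor introduced by averaging partial reflections is harmless in this picture.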

Given an adequate number of exposures, SFX-derived data are suf- 
ficiently accurate to reveal even small features such as the differences 
in electron density between different side chains and the very weak
anomalous diffraction from endogenous sulphur atoms in a protein.
However, these experiments relied on molecular replacement to obtain 
phases. De novo phasing from a heavy atom derivative, on the other 
hand, requires an accuracy well beyond that needed to find the heavy 
atom(s) in phased difference Fourier maps. 

To test whether sufficient accuracy can be attained for de novo 
phasing by SFX, we collected a highly redundant, 2.1 Å resolution SFX
data set for a lysozyme heavy atom derivative that gives a strong anom-
alous signal from two gadolinium atoms per asymmetric unit. Lysozyme
microcrystals were soaked in gadoteridol, an organic gadolinium complex,
and injected in their soaking solution into the vacuum chamber of the
coherent X-ray imaging instrument (CXI) of the LCLS at the SLAC
National Accelerator Laboratory using a liquid microjet essentially as
described previously. X-ray pulses of nominally 50 fs duration and
8.5 keV photon energy of 2.6 mJ average power were used to collect
2,402,199 diffraction patterns. Of these, 191,060 crystal diffraction pat-
terns (8%) were identified using CASS, 59,667 of which were indexed
(31%) and integrated using CrystFEL.
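
The percentages quoted above follow directly from the pattern counts. The short check below recomputes them and adds two derived figures that are not stated in the text: the overall fraction of collected frames that ended up indexed, and the minimum detector time implied by the 120 Hz frame rate given in the Methods Summary. Variable names are illustrative.

```python
# Counts quoted in the text.
collected_patterns = 2_402_199
crystal_hits = 191_060
indexed_patterns = 59_667
frame_rate_hz = 120  # detector frame rate from the Methods Summary

hit_rate = crystal_hits / collected_patterns          # fraction of frames with crystal diffraction
indexing_rate = indexed_patterns / crystal_hits       # fraction of hits that were indexed
overall_yield = indexed_patterns / collected_patterns
min_hours = collected_patterns / frame_rate_hz / 3600  # ignores any dead time

if __name__ == "__main__":
    print(f"hit rate:       {hit_rate:.1%}")
    print(f"indexing rate:  {indexing_rate:.1%}")
    print(f"overall yield:  {overall_yield:.1%}")
    print(f"minimum detector time: {min_hours:.1f} h")
```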

As expected from the quality of the anomalous difference Patterson 
map calculated using ~60,000 diffraction patterns (Fig. 1), heavy atom 
binding sites could be determined using automated methods (SHELXD)
and refined using SHARP. The resulting refined sites were used for
likelihood-based SAD phasing using PHASER for Experimental Phasing,
followed by density modification using DM. Subsequently, automatic
building by ARP/wARP resulted in correct main-chain tracing for
over 60% of the structure. Iterative cycles of rebuilding and refinement
resulted in a model at 2.1 Å resolution with good statistics (see Extended
Data Table 1). The quality of the phases at the various stages of the
phasing process is shown in Fig. 2; a sequence of electron density maps
from these stages is shown in Fig. 3. Figure 3 also
indicates CCmap at each stage, that is, the correlation of the electron
density map with the final electron density phased by the refined model.
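
The phase-quality measure plotted in Fig. 2 is described in its caption as the average cosine of the difference between the phases at a given stage and the final, refined phases. A minimal sketch of that calculation is given below. It assumes phases are supplied in degrees as two equal-length lists; the function name and the example values are hypothetical, and no resolution binning or weighting is attempted.

```python
import math


def mean_phase_cosine(phases_deg, reference_deg):
    """Average cos(phi - phi_ref) over reflections; 1.0 means identical phases,
    values near 0 mean essentially random phases."""
    if len(phases_deg) != len(reference_deg) or not phases_deg:
        raise ValueError("need two non-empty phase lists of equal length")
    total = sum(
        math.cos(math.radians(p - r)) for p, r in zip(phases_deg, reference_deg)
    )
    return total / len(phases_deg)


if __name__ == "__main__":
    final = [10.0, 95.0, 200.0, 310.0]
    after_sad = [70.0, 155.0, 260.0, 10.0]  # each 60 degrees away from the final phase
    print(f"mean phase cosine: {mean_phase_cosine(after_sad, final):.2f}")
```

Because the cosine is periodic, phase differences do not need to be wrapped into a fixed interval before averaging.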

As expected for SFX data, the data quality depends strongly on
the number of integrated patterns (Fig. 4). In SFX, an internal quality
measure called Rsplit is used, which is calculated by splitting the set of
diffraction images into two halves, normally even- and odd-numbered
images, integrating each half to obtain two sets of intensities and calcu-
lating the R-factor between them. Rsplit is closely related to Rmerged-F
used in conventional crystallography, apart from the factor √2 in the
denominator, which is introduced to account for the reduction in
multiplicity caused by splitting the data set in two. Rmerged-F, in turn,
can be cast in an analytical form known as Rp.i.m., the precision-indicating
merging R-factor. All these R-factors account for the increase in data quality
(signal-to-noise ratio) afforded by high measurement multiplicities.
Importantly, it has been suggested that Rp.i.m. can be used to predict
whether the data are of sufficient quality for successful structure solu-
tion by comparing it to Rano, an R-factor based on anomalous intensity
differences. Whereas as a measure of anomalous signal strength Rano is
compromised by measurement errors, the ratio Rano/Rp.i.m. is a useful
predictor of the feasibility of substructure solution and phasing.

Figure 1 | w = 0.5 section of the origin-removed, super-sharpened
anomalous difference Patterson map of the SFX lysozyme gadolinium data,
using ~60,000 images. Clear peaks are observed from the anomalous
scattering of the gadolinium atoms. This figure was prepared using XPREP.

Figure 2 | Quality of the phases at the various stages of the phasing process.
As another, resolution-dependent measure of the progress of the phasing
process through its various stages, we calculated the average cosine of the
difference between the phases at a certain stage and the final, refined phases, in
analogy to the classical FOM measure. The quality of the phases as measured by
⟨cos|φ − φfinal|⟩ after SAD phasing improved markedly after solvent flattening
over the whole resolution range, after which wARP provided further
improvement at intermediate resolution.
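
The half-data-set R-factor and the anomalous R-factor discussed here can be sketched in a few lines. The form used for Rsplit, (1/√2)·Σ|I_even − I_odd| / (½·Σ(I_even + I_odd)), is the definition commonly given for SFX data and matches the √2 in the denominator mentioned in the text. Rano is written in the analogous form using Friedel-mate intensities I+ and I−. Both expressions, the function names and the example intensities are assumptions for illustration, not the exact code paths of any processing package.

```python
import math


def r_split(i_even, i_odd):
    """Half-data-set R-factor with the sqrt(2) multiplicity correction."""
    numerator = sum(abs(a - b) for a, b in zip(i_even, i_odd))
    denominator = 0.5 * sum(a + b for a, b in zip(i_even, i_odd))
    return numerator / (math.sqrt(2) * denominator)


def r_ano(i_plus, i_minus):
    """R-factor on anomalous (Friedel-mate) intensity differences."""
    numerator = sum(abs(p - m) for p, m in zip(i_plus, i_minus))
    denominator = 0.5 * sum(p + m for p, m in zip(i_plus, i_minus))
    return numerator / denominator


if __name__ == "__main__":
    # Made-up intensities for three reflections, purely for illustration.
    noise = r_split([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
    signal = r_ano([120.0, 210.0, 330.0], [100.0, 190.0, 290.0])
    print(f"Rsplit = {noise:.3f}, Rano = {signal:.3f}, Rano/Rsplit = {signal / noise:.2f}")
```

The ratio printed at the end plays the role of the predictor described in the text: the larger the anomalous differences are relative to the estimated noise, the better the prospects for substructure solution.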

To see whether Rsplit has a similar predictive value, we calculated
Rano/Rsplit for data sets calculated using 60,000, 30,000, 15,000 and
7,500 images and found it to be 1.8, 1.4, 1.2 and 1.1, respectively.
When all ~60,000 patterns (red line, Fig. 4) were used, excellent values
of Rsplit were obtained, as well as a high anomalous correlation coef-
ficient CCano of 0.48. Using a lower number of images (green, blue and
purple lines, Fig. 4) resulted in significantly increased Rsplit, but more
importantly in a marked decrease in the anomalous correlation coef-
ficient CCano.
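
An anomalous correlation coefficient of this kind is usually computed as the Pearson correlation between the anomalous differences (I+ − I−) obtained from two independent halves of the data. The sketch below follows that convention. It is an assumption that this matches the exact procedure behind the CCano values quoted here, and the function names and test data are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cc_ano(half1_plus, half1_minus, half2_plus, half2_minus):
    """Correlate anomalous differences from two independent half data sets."""
    diff1 = [p - m for p, m in zip(half1_plus, half1_minus)]
    diff2 = [p - m for p, m in zip(half2_plus, half2_minus)]
    return pearson(diff1, diff2)


if __name__ == "__main__":
    value = cc_ano(
        [120.0, 90.0, 330.0, 55.0], [100.0, 100.0, 300.0, 50.0],
        [125.0, 92.0, 320.0, 60.0], [100.0, 104.0, 295.0, 52.0],
    )
    print(f"CCano = {value:.2f}")
```

With noisy half sets the differences correlate poorly and the coefficient falls towards zero, which is the behaviour the text reports as fewer images are merged.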

In the anomalous difference Patterson map (Fig. 1), too, a strong 
dependence of peak height on the number of integrated patterns was 
observed. The Patterson peak at (u,v,w) = (0.38, 0.30, 0.50) provides a 
good example, its height being 6.7, 5.7, 4.0 and 3.1 σ using 60,000,
30,000, 15,000 and 7,500 patterns, respectively. Using SHELXD, cor-
rect heavy atom sites could be found in all these cases, but only with
60,000 images did phasing result in an interpretable map. Thus, in this
particular case at least, there appears to be a correlation between suc-
cessful phasing and Rano/Rsplit. Extended Data Fig. 1 shows Rano and
Rsplit as a function of resolution.

Lanthanide LIII edges give a very strong anomalous signal, with
large values of f’’ close to the CuKα wavelength, making lanthanide
derivatives attractive for both in-house and synchrotron measurements.
We were therefore surprised by the relatively low CCano and Rano/Rsplit
of the SFX data and collected a low-dose, room-temperature data set
from a large crystal mounted in a capillary using a rotating anode
source. As expected, the anomalous signal was very strong (CCano = 0.92;
for Rano/Rp.i.m. see Extended Data Table 1 and Extended Data Fig. 1) and
phasing was possible using a fully automated approach. Although
these data are not completely isomorphous with our SFX data, no large
systematic differences are apparent. The lower CCano of the SFX data is
therefore probably owing to the fact that the SFX data have not yet
fully converged, because CCano and Rano/Rsplit are still increasing with
the number of patterns used (see Fig. 4 and above). This is, however,
not a fundamental hindrance, because the number of collected
diffraction images was limited by the available beam time. The anticipated
higher repetition rates of both upgraded and new FEL facilities will
greatly speed up the acquisition of high-multiplicity data sets. In addi-
tion, the use of seeded FEL pulses with their expected higher repro-
ducibility and ‘cleaner’ spectrum may result in faster convergence of
the merged reflection intensities. Moreover, we expect that improve-
ments in data processing, such as the inclusion of profile fitting and
post-refinement, will reduce the number of patterns required for conver-
gence. Lastly, continued development of the current, first-generation
fast detectors will improve the hitherto limited accuracy of diffraction
measurements and thus of the derived anomalous (and isomorphous)
differences. All of these improvements are expected to reduce the
number of patterns and thus sample quantity required. This will be
especially important if the weak anomalous signal from endogenous
sulphur is to be used (see Extended Data Fig. 2).

9 JANUARY 2014 | VOL 505 | NATURE | 245

Figure 3 | Progression of the phasing process. Electron density maps are
shown. a, SAD phasing with PHASER (CCmap = 0.46). b, Solvent flattening with DM.
c, Automatic building using wARP. d, Final map after refinement. The
correlation between the respective maps and the final, refined 2mFo − DFc
electron density (d) is indicated. All maps are contoured at 1.0 σ.

Figure 4 | Data quality as a function of resolution and number of indexed
patterns used to derive integrated intensities as shown by Rsplit. The
anomalous correlation coefficient CCano for the whole resolution range is
indicated as well.

Despite the current limitations of the experimental setup, we have 
shown here that X-ray free-electron lasers provide diffraction data 
from small crystals that are sufficiently accurate for de novo phasing. 
This proof-of-principle demonstration opens a unique and novel path 
to the determination of structures that elude traditional analysis. Such 
targets include radiation-sensitive samples such as the very small crys- 
tals that are often the only ones available for many macromolecules 
that are difficult to crystallize, in particular membrane proteins and 
their complexes. 


METHODS SUMMARY 


Microcrystals (≤1 × ≤1 × ≤2 µm³) of hen egg-white lysozyme were grown as
described earlier and transferred to 8% NaCl, 0.1 M sodium acetate buffer, pH 4.0.
100 mM gadoteridol (Gd3+:10-(2-hydroxypropyl)-1,4,7,10-tetraazacyclodode-
cane-1,4,7-triacetic acid) was added to this storage solution and the crystals left
to incubate at room temperature for at least 30 min before data collection. SFX
diffraction data were collected using X-ray pulses of 50 fs duration (electron bunch
length) and 8.5 keV photon energy of 2.6 mJ average power, essentially as described
previously. A suspension of 30% (v/v) of gadoteridol-treated lysozyme crystals in
their soaking solution was injected into the 1 µm focus chamber of the CXI instru-
ment at LCLS using a liquid jet of 3-4 µm diameter running at 30 µl min⁻¹.
Single-shot diffraction patterns were collected using a CSPAD detector at 120 Hz.
All frames were saved, and protein diffraction patterns were identified using
CASS and indexed and integrated using CrystFEL. Using data to 1.8 Å resolu-
tion, heavy atom binding sites were determined using SHELXD and refined
using SHARP. The resulting refined sites were used for likelihood-based SAD
phasing using PHASER for Experimental Phasing, followed by density modifica-
tion using DM. Subsequently, automatic building by ARP/wARP resulted in
correct main-chain tracing for over 60% of the structure. Iterative cycles of
rebuilding and refinement resulted in a model at 2.1 Å resolution with good stat-
istics. A comparison data set using a single large lysozyme crystal soaked with
gadoteridol was collected using a rotating anode, similarly as described previously.
A full description of the methods is available in Supplementary Information.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
Extended Data Figure 1 | Anomalous signal strength of the SFX data (blue
lines) as well as the rotating anode data (red lines) as measured by Rano on
intensities (solid lines). The noise in the data is indicated in terms of Rsplit for
the SFX data and Rp.i.m. for the rotating anode data (dashed lines).
Extended Data Figure 2 | Expected anomalous signal strength for a SAD
experiment on lysozyme. Expected anomalous signal strength for a SAD
experiment on lysozyme with 2 gadolinium atoms per protein molecule at
8.5 keV (top panel) and for a sulphur-SAD experiment on lysozyme with 10
sulphur atoms per protein molecule at 6.0 keV (bottom panel). In each case, an
optimistic scenario with all anomalous scatterers ordered is shown (green line)
as well as a pessimistic scenario in which 60% of the anomalous scatterers are
ordered (blue line). This figure was prepared using the anomalous scattering
web server at http://skuld.bmsc.washington.edu/scatter/AS_signal.html.
Extended Data Table 1 | Data collection, phasing and refinement statistics
Columns: LCLS SFX data; Rotating anode data.
Rows: Space group; Cell dimensions, a, b, c (Å) and α, β, γ (°); Wavelength (Å);
Pulse energy/fluence at sample; Number of collected diffraction patterns;
Number of indexed patterns.

Beamline transmission 50%; the resulting 1.3 mJ pulse energy was attenuated to 4% transmission.
Highest resolution shell is shown in parentheses.
In SFX, all reflections are partials and no separate measurements of full reflections are recorded; therefore, Rmerge, defined as the normalized sum of the absolute differences between single observations of fully
integrated reflections and their mean value, is meaningless.
For the SFX data, 59,667 crystals were used, whereas for the rotating anode data, a single crystal was used.
Standing out


Welcoming lab environments and networking organizations 
help lesbian, gay, bisexual and transgender scientists to excel. 


BY CAMERON WALKER 


Looking up from a courtyard at the College
of New Jersey in Ewing, passers-by can
see a rainbow pride flag in the window
of chemist Benny Chan. He has not always
been so open. Chan came out as gay to just a
few people in graduate school, and although
he did not hide his sexual orientation once he
started working at the College of New Jersey,
he decided not to be vocal about it until he met
the requirements for tenure. His past advisers
and the administration at his new job were sup-
portive, he says, but “there’s always that little
bit of doubt in my head” — one uncomfort-
able or discriminatory colleague could cause
problems.

Scientists do not always share their personal
sides in the lab. Deciding whether to be open
about one’s identity can be an acute issue for
lesbian, gay, bisexual and transgender (LGBT)
researchers. Unlike some other minorities,
LGBT people “have the ability to conform,
because it’s not always a visible trait”, Chan says.
But hiding something as basic as sexual orien-
tation or gender identity can be detrimental to
mental health and work. “You need to spend a
lot of extra energy if you feel like you need to
hide a part of your life,” says Chan.

Researchers may have trouble finding col- 
leagues who share their experiences — which 
can be anything from overt or subtle discrimi- 
nation to complete comfort in the workplace. 
Many want to know how best to support 
younger LGBT scientists, who might not know 
where to turn for mentoring. 

As broader awareness of LGBT scientists 
grows and more of the science community 
starts to appreciate the issues that affect them — 
including same-sex marriage and anti-discrim- 
ination laws — groups are convening to foster 
a sense of community and, in some cases, to 
develop best-practice guidelines. These organi- 
zations aim to ensure that LGBT researchers get 
the support they need so that isolation does not 
keep them from being effective scientists. 


GATHERING DATA 

There is a growing body of research on women 
and ethnic minorities in science, but the 
number and experience of LGBT research- 
ers has been less widely studied. To address 
this, Jeremy Yoder, an evolutionary-biology 
postdoc at the University of Minnesota, Twin 
Cities, and Allison Mattheis, an educational 
researcher at California State University, Los 
Angeles, gathered more than 1,400 responses 
to the ‘Queer in STEM’ survey, which exam- 
ined sexual diversity among people working 
in science, technology, engineering and maths 
(STEM) and how their identities might affect 
their careers. Most were from the United 
States, but there were also responses from Can- 
ada, the United Kingdom and India, among 
other countries. 

Preliminary results suggest that partici- 
pants who rated their workplaces as safe and 
welcoming and whose employers supported 
LGBT-specific needs — such as health-care 
benefits for same-sex partners in the United 
States — were more likely to be open with their 
colleagues about their identities. However, 
Yoder and Mattheis found that where respond- 
ents lived made no difference to how ‘out’ they 
were to colleagues or students, even if the
researchers were in big cities or in regions
thought to be LGBT-friendly. 

Yoder and Mattheis hope that their survey 
results, which they are in the process of writ- 
ing up for submission, will make other scien- 
tists more aware of and welcoming to LGBT 
researchers. The pair contend that when 
heterosexual researchers know about their 
colleagues’ identities, they are more likely to 
support policies such as partner benefits and 
expanding equal-opportunity employment to 
cover sexual orientation and gender identity. 
And the authors expect that more information 
about the community will encourage LGBT 
people to enter STEM. “By making queer folk 
working in STEM more visible, we can help 
prompt STEM workplaces, professional socie- 
ties and university departments to take LGBT- 
specific needs into consideration in policy,” 
Yoder says. 


CREATING SAFE SPACES 
LGBT researchers can turn to a growing num-
ber of support and networking groups (see
‘Safe meeting spaces’). Some groups are work- 
ing on best-practice guidelines to help aca- 
demic departments to deal better with LGBT 
issues. Elena Long, a postdoc in nuclear phys- 
ics at the University of New Hampshire in Dur- 
ham who started the LGBT+ Physicists group 
in 2009, has worked with colleagues to create 
a guide for physics departments. 

These guidelines range from changes that
can be made quickly — such as using gender-
neutral language in the classroom and lab or
inviting LGBT speakers to campus — to those
that require long-term, department-wide
efforts, such as adding non-discrimination
statements to job announcements and mak-
ing diversity training available to faculty
members and staff.

Many institutions offer on-campus training
about LGBT issues (often called ‘safe zone’
training). This usually consists of a several-hour
session in which participants learn about
resources for LGBT students and the
community itself. They may receive stickers
that they can place on their office doors to
identify safe spaces in which people are
welcome to discuss LGBT issues. Some
institutions have diversity offices that run these
programmes. Independent organizations such
as the Diversity Trust, based in the United
Kingdom, also offer training.

Although a sticker may seem like a small
effort, a study of the Safe Zone programme at
Iowa State University in Ames in 2002 suggests
that these programmes can improve the climate
on campus by visibly affirming that the needs of
LGBT students are valid, and increasing hetero-
sexuals’ awareness of both the LGBT commu-
nity and their own biases (N. J. Evans J. College
Student Dev. 43, 522-539; 2002).


LGBT EVENTS

Safe meeting spaces

Several organizations worldwide hold
conferences and events specifically aimed
at lesbian, gay, bisexual and transgender
(LGBT) scientists.

• Since 2010, the US National Organization
of Gay and Lesbian Scientists and
Technical Professionals (NOGLSTP) has
put on Out to Innovate, a biennial two-day
career summit that includes a career fair,
workshops and speakers. In 2014 it will be
in Atlanta, Georgia, and will be co-hosted
with Out in STEM, a national society
supporting LGBT students. The summit will
facilitate mentoring and include industry
representatives and tours of local companies.
• The non-profit organization Out for Work
in Washington DC, which assists LGBT
students with career development, runs
annual conferences.
• Ecologists have been attending an
informal networking LGBT lunch at the
Ecological Society of America meeting
since the late 1990s.
• A group of geoscientists holds an
independent dinner for LGBT researchers
during the annual American Geophysical
Union meeting in San Francisco, California.
• The UK Gay and Lesbian Association
of Doctors and Dentists has an annual
conference for students and holds
educational events. It also facilitates
networking.
• The Australian Lesbian Medical
Association in South West Rocks supports
lesbian doctors and medical students,
and their partners. It offers social events,
mentoring and an annual meeting.
• Workplace Pride in Amsterdam holds an
annual conference aimed at improving the
workplace for LGBT people.
• Sticks & Stones, a diversity-focused
career fair that bills itself as Europe’s
largest for LGBT and straight people, will
be held in Berlin in 2014. Last year, several
pharmaceutical and technology companies
attended. C.W.

The best-practices guide from LGBT+ 
Physicists also offers some measures to ease 
the path for transgender researchers. On a 
departmental level, simplifying the process 
of name changes on campus records — and 
indicating that changes will not affect some- 
one’s job, tenure or award applications — can 
be particularly meaningful for transgender 
people, who may have elected to transition to 
or have started identifying as their preferred 
gender during graduate studies. The CV can be 
a minefield: many transgender people “face an 
extremely difficult choice when applying to a 
new position”, says Long. “Either risk discrimi-
nation by outing yourself as trans, or risk dis-
crimination by leaving out a significant chunk
of your past work under a different name.” 


FINDING COLLEAGUES 

Meeting researchers with similar backgrounds 
and concerns is becoming easier. LGBT 
researchers have been convening an informal 
networking dinner at meetings of the Ameri- 
can Astronomical Society (AAS) for more than 
20 years, but “you had to know it existed”, says
Jane Rigby, an astrophysicist at NASA. 

After several members of this group wrote 
a charter, the AAS Council created an offi- 
cial working group on LGBTIQ Equality (the 
I stands for ‘intersex’ and the Q for ‘queer’ or 
‘questioning’). The group’s networking and 
other events now appear in the AAS confer- 
ence programme. The working group is also 
collaborating with LGBT+ Physicists on joint 
best-practice guidelines. 

There are also online physics and astronomy 
‘out lists’, to which LGBT researchers have vol-
untarily added their names and, in many cases,
contact information so that they can be help- 
ful to others. Both lists also include non-LGBT 
researchers who support the community. Some 
institutions, such as the University of Califor- 
nia, San Francisco, have their own out lists. 

Many institutions have LGBT networks. 
At CERN, Europe's particle-physics lab near 
Geneva, Switzerland, an LGBT group hosts 
social events and weekly lunches in the cafete- 
ria to promote visibility, which is potentially 
helpful for LGBT visitors. 

Young LGBT scientists can find both com- 
munity and professional networking through 
mentoring. The US-based non-profit National 
Organization of Gay and Lesbian Scientists 
and Technical Professionals (NOGLSTP), 
which provided funding for the Queer in 
STEM survey, offers eight-month mentoring 
programmes for members. Through a partner- 
ship with MentorNet, an online STEM men- 
toring network, it matches undergraduate and 
graduate students, postdocs and other early- 
career professionals with mid- or later-career 
scientists in academia or industry. The goal 
is to keep LGBT people in STEM careers
and to provide someone for students to 
talk to if they feel that they cannot discuss 
their personal lives with their advisers, says 
Rochelle Diamond, who is the chair of the 
NOGLSTP’s board of directors and man- 
ages two labs at the California Institute of 
Technology (Caltech) in Pasadena. 


LGBT MOBILITY 

Changes to marriage laws in some coun- 
tries may influence acceptance of LGBT 
people in society at large, and improve the 
prospects of scientists looking for the right 
department fit (see Nature 454, 132-133; 
2008). In the United States, for example, 
there is still a patchwork of state laws that 
forbid same-sex marriage. But last June, the 
US Supreme Court declared that the section 
of the Defense of Marriage Act that prohib- 
ited federal recognition of same-sex mar- 
riage was unconstitutional. That may boost 
the immigration of LGBT scientists, who 
can now sponsor foreign-born spouses for 
permanent-resident status. It can also help 
US-based researchers and their spouses. 
Rigby and her wife and child are now on 
the same insurance plan; combined with 
other benefits that are now permitted, they 
may save several thousand dollars this year. 

At conferences, Carolyn Brinkworth, an 
astronomer at the Infrared Processing and 
Analysis Center at Caltech, wears a rain- 
bow sticker with the words ‘Safe Space’ or 
a badge from an LGBT youth organization 
for which she volunteers. Young scientists 
have approached her to say that they have 
not felt comfortable being out at work. “It's 
rare that they tell me the climate is hos- 
tile,” she says. More often, she says, these 
researchers do not want to think about 
introducing a potential new source of work 
stress by coming out, or are not sure how 
their advisers or peers will react to their 
identity. 

But Chan has found that being out 
proved better not only on a personal level, 
but also on a professional one. A volunteer 
for the American Chemical Society (ACS), 
Chan discussed being gay in an ACS pub- 
lication after his tenure decision. Later he 
received multiple e-mails from colleagues 
whom he knew from ACS meetings. Most 
were e-mails of support, but one colleague 
also asked him about his single-crystal 
X-ray diffractometer. The two have now 
collaborated on multiple papers. 

And at an LGBT reception at an ACS 
meeting, Chan also met a researcher who 
may host his sabbatical. “Being out has
really helped me,” he says. “It frees you
up to think of your research, and your
scholarship.”


Cameron Walker is a freelance writer 
based in Santa Barbara, California. 


TURNING POINT 


Eleni Antoniadou 


PhD student Eleni Antoniadou co-founded the 
London-based start-up Transplants Without 
Donors in 2009 to develop tissue-engineered 
organs. Antoniadou, who also blogs for The 
Huffington Post, was shortlisted in September 
in the science category of the 2013 Women of 
the Future Awards, Britain’s industry-funded 
search for successful early-career women. 


What led you to tissue engineering? 

I was working at a hospital as an undergradu- 
ate and saw that prosthetics had limitations. I 
wanted to do research that could give patients 
something better. I found regenerative medicine 
and tissue engineering to be promising fields. 


What was your first tissue-engineering project? 
While studying for a master’s in nanotechnol- 
ogy and regenerative medicine at University 
College London, I worked on neural genera- 
tion — testing biomaterials that could become 
artificial nerves. I also got involved in develop- 
ing a business plan for an artificial trachea. I felt 
overwhelmed when it was successfully received 
by a patient. It was proof that tissue engineering 
could be applied in clinical practice. 


So you launched the start-up soon afterwards? 
While in London, I joined several physicians 
and scientists to co-found Transplants Without 
Donors so that we could work on tissue-engi- 
neering scaffolds for several different organs. 
In launching this company, I came to appreci- 
ate the complexity of the science behind tissue 
engineering. In 2010, after receiving a scholar- 
ship from the Fulbright Program and the Insti- 
tute of International Education, I came to the 
University of Illinois at Urbana-Champaign to 
get a master’s in bioengineering, with a focus 
on developing artificial skin. This is challeng- 
ing, yet is a product that many patients need. 


What has been the start-up’s main challenge? 
Securing financial support. But it was also 
challenging to find people with the appropri- 
ate multidisciplinary background. We had to 
learn how to design experiments so that all the 
scientists on our 25-member team could con- 
tribute to and understand them. We are hoping 
that the products we launch next year — mostly 
tissue-engineering scaffolds and bioreactors 
for different organs — will be used by other 
researchers. Sharing products throughout labs 
could really help to move the field forward. 


You spent time at NASA recently. How did that 
influence your research? 
I was beginning a PhD at the University of
Illinois when the European Space Agency and
NASA selected me to work at the biosciences
division of NASA’s centre for nanotechnology
for several months. That was a turning point
in my career: it was the most innovative place
I’d ever been. I saw the importance of tackling
big, risky projects.


How did you start blogging for The Huffington 
Post? 

After being nominated for the award, I was 
invited to write for the blog to raise awareness 
of the future of technology and of women in 
science. So far, I've written about the future of 
tissue-engineered organs and the importance 
of space exploration. Thanks to my posts, I’ve 
had scientists approach me to collaborate 
on projects and heard from people who are 
curious about tissue engineering. 


Name a pivotal moment in your career. 

In the past few years, I’ve been to Peru and 
Costa Rica to volunteer with the Foundation 
for the International Medical Relief of Chil- 
dren, a non-profit organization based in Phila- 
delphia, Pennsylvania, that sends out teams to 
perform operations or offer health care. We 
gave vaccinations and pharmaceuticals to sick 
kids, including those victimized by the illegal 
organ trade. It was really fulfilling and has 
helped to drive everything we do in the lab. 


What do you plan to do after you get your PhD? 
I would like to do research in the lab, working 
full-time at Transplants Without Donors to 
bring products to market. We need to develop 
a legislative framework for tissue-engineering 
products — one that will be universal.


INTERVIEW BY VIRGINIA GEWIN 
SCIENCE FICTION


OKAMI


A matter of honour.


BY GRACE TANG


“Okaasan, look!” The boy tugged on his
mother’s arm, pointing at the balding man
splayed across the pavement. “That man is
asleep on the street!”

The stench of alcohol wafted up from the
unconscious man’s body. While the boy’s
mother politely chose to ignore the man’s
transgression, Yuka openly stared. She
gingerly picked her way over him and into the
club from which he had emerged.

She scanned the tables for her client as her
eyes adjusted to the dark. Hiro was sitting
in a private booth, a bottle of whisky in the
centre of the table.

“Hi.” She slid into the booth to sit facing
him.

“I already told your boss, I’m not
interested in hostesses tonight.”

“I’m Yuka.”

He looked up from his glass, one eyebrow
raised.

“You don’t look the part.”

“And that’s precisely why I’m so good at
what I do.”

The left corner of his mouth raised slightly
in a half smile. Given what her boss had told
her of her client’s reputation, Yuka
considered this a minor victory. He reached into
his jacket pocket and produced a phone, sliding
it across the table.

“In the photo album you will find
geo-tagged, time-stamped pictures of your target.
I trust that is all you need.”

She pocketed the phone and rose from the
booth. “Consider it done.”


The days were short this time of year,
meaning the drinking started early, fuelled by the
Western holiday season. Yuka tried to block
out the drunken chorus of businessmen in
the karaoke bar next door as she studied her
target, looking for all the world like a teenage
girl texting her friends.

Her target appeared young, probably in
his mid-twenties, somewhat good-looking.
For most people, the deductions would stop
there, but Yuka picked up on things most
people would miss, like the characteristic
bump of a concealed sidearm.

The geo-tags were erratic, but mostly
centred in downtown Tokyo, disturbingly near
areas where certain high-ranking Yakuza
members had recently been found dead.
Judging from the time stamps on the photos,
he was active mostly at night.

She was liking this job less and less.

She flipped through the rest of the
pictures quickly, until one image made her do
a double take.

She pressed the back button and zoomed
in on her target’s neck. She squinted long and
hard until she had convinced herself without
a shadow of a doubt that her eyes were not
playing tricks on her.

“Son of a b—”

She was going to need more than a handgun
for this job.


“Hiro,” Yuka struggled to control herself as
she growled into the phone he had given her.
“I want the fee doubled.”

“We had a deal.”

“When we struck that deal, I thought I was
dealing with a human.”

“He is technically still part human.”

“Technically, he can 
smell my goddamn gun 
from two miles away!” 

“So, get a gun that can 
shoot farther than two miles. I’m sure your 
boss can spare you one.” 


Yuka was sweating bullets. She could handle 
any human, but this was the first time she 
had had to take down an Okami. 

She methodically set up the sniper on the 
ledge of the roof. It was half an hour to the 
new year, and the cheerful sounds of party- 
goers echoed from the apartments and 
streets below her. According to her boss's 
intel, her target would be exiting the oppo- 
site building about now, after completing his 
own hit. 

Minutes passed like hours. Yuka took her 
eyes off the sight for a second to glance at her 
wristwatch. Where was he? 


“Looking for me, I presume.” 

Yuka jumped, drawing her handgun from 
her waist at the same time as she spun round 
to face him. 

He was more impressive in the flesh, his 
movements graceful, his voice commanding. 

But most surprising of all was his scent —
mostly canine, a hint of human, powerful and

“You’re not one of my hits. I won’t harm
you.”

He seemed almost nonchalant about the 
fact that she was pointing her handgun at 
him, finger on the trigger. 

Yuka lowered her weapon slightly. 

“Is this a trick?”

“I wish.” He laughed,
unsmiling. “Perhaps 
you are too young, or 
your masters treat you 
well. But I am tired of 
being used.” 

Her grip on her 
weapon tightened. “I 
have no masters. The 
people I work for take a 
cut of my fee, but what I 
earn is my own.” 

“Boss, master, what’s 
the difference? At the 
end of the day, we are all 
their dogs. Isn't that why 
we wear these collars?” 

For the first time 
tonight, Yuka was 
aware of the cool metal 
of the pendant she wore 
around her neck, one 
that identified people of her species, specifi- 
cally engineered for her profession. 

“As I was saying, I am tired. You may
complete your job.”

She was shaking now, blinking furiously. 

“Sumimasen, Oniisan. I don’t want to,
but... I have my obligations...”

“Yes, the loyalty of the wolf is why it was 
chosen to construct us. The only honourable 
way out of service to our masters is death.”

He bowed. “Sayonara.”

He disappeared down the stairwell. Wiping 
her eyes, Yuka readjusted her sniper, gazing 
through the night-vision-enabled sight. 


The clock struck midnight, and fireworks 
lit up the sky, accompanied by thunderous 
booms and the stench of gunpowder. 

“Okaasan! Look!” The boy tugged at his 
mother’s sleeve and pointed. She smiled and 
held him tighter in her arms, their faces lit up 
by the showers of shimmering sparks. 

No one paid any attention to the young, 
well-dressed man, lying prone in the mid- 
dle of the pavement, as they picked their feet 
carefully around him.